SUMMARY


The application of a class of continuous, one-sided, three-parameter
probability distributions is considered. The parameters represent scale
and initial and terminal shape of the associated probability density function.
The class contains as special cases (for specific numerical values of the
shape parameters) the following well-known distributions: Gauss, Weibull,
exponential, Rayleigh, Gamma, chi-square, Maxwell, and Wien. The objective is
to present and discuss a parameter determination technique which uses
cumulative frequency data. The approach is based on an applicability criterion
for the considered distribution class which provides the opportunity to
determine the parameter values by means of three equations derived from the
first and second moments, and an analytical approximation of the logarithm of
the cumulative distribution function. Since the scale parameter can be
eliminated, the parameter determination process requires the iterative
solution on a personal computer (PC) of only two equations. Convergence of the
iteration process provides the ultimate practical justification for the
applicability of the considered distribution class relative to given empirical
data. Examples are given to verify the efficiency of the proposed parameter
determination method.


I. INTRODUCTION


The objective is to revitalize interest in the application of a class of
probability distributions which has been designated a generalized Gamma
distribution by various authors [1, 2, 3, 4, and 5]. This class represents
three-parameter, continuous, one-sided distributions which may be defined in
terms of the cumulative distribution function (cdf)

$$
F(x) = \begin{cases}
\dfrac{\gamma\left((1-p)\beta^{-1},\, \xi^{\beta}\right)}{\Gamma\left((1-p)\beta^{-1}\right)}, \quad \xi = x b^{-1}, & x > 0, \\[1ex]
0, & x \le 0,
\end{cases}
\qquad (*)
$$

with parameters b, p, and β, Γ(y) and γ(a,y) being the Gamma function and the
incomplete Gamma function (with lower integration limit zero), respectively.

Apparently, this class of distributions was introduced originally by
L. Amoroso [6]. Various aspects of it received attention in fairly recent
publications [7, 8]. These papers refer to the close connection of the class
(*), via the associated probability density function (pdf), with a class of
parabolic differential equations (generalized Feller equation). They also
establish a connection with the underlying dynamical diffusion process. In
this context the publication [9, Sec. 7] may be of particular interest.

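The probability density function (pdf) class associated with the cdf class
(*) is given by

$$
f(x) = \frac{dF(x)}{dx} = \begin{cases}
\dfrac{\beta}{b\,\Gamma\left((1-p)\beta^{-1}\right)}\, \xi^{-p} \exp\left(-\xi^{\beta}\right), \quad \xi = x b^{-1}, & x > 0, \\[1ex]
0, & x \le 0.
\end{cases}
\qquad (1)
$$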


The expression for f(x) clearly demonstrates the meaning of the parameters
b, p, and β. The parameter b > 0 represents scale, p < 1 represents initial
shape (for small values of x > 0), and β > 0 represents terminal shape (for
large values of x).

A shift parameter x_0 may be introduced by replacing x by x - x_0, x > x_0.
That will not be done here, however, since only distributions of the
three-parameter type (*), (1), are of interest. To partially lift the
restrictions on p and β, one may replace the independent variable x by, say,
y^{-1}, y > 0 being a new independent variable; however, this possibility will
not be of further concern here. Another remark concerns a notational change
relative to the earlier papers [7, 8, 9]. The parameter λ which appeared there
has been replaced by β = 1 - λ.
The reason for the designation of p and β as initial and terminal shape
parameters, respectively, is evident. For large values of x, the exponential
function in (1) is the dominating factor and, consequently, the shape of the
pdf curve or, more precisely, its rate of decay for large values of x is
determined by β. In any case, f(x) → 0 as x → +∞. Since the exponential
function approaches unity as x ↓ 0, the initial shape of the pdf curve is
determined by p. If 0 < p < 1, f(x) → +∞ as x ↓ 0 so that, in this case, a
J-shaped distribution is being dealt with. If p = 0, f(x) → β/(bΓ(β^{-1})); the
distribution is of the half bell-shaped type (purely exponential). Finally,
if p < 0, f(x) → 0 as x ↓ 0. The distribution is hump-shaped, the pdf having
a unique maximum at the point x_max = b(-pβ^{-1})^{1/β}.
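
The three shape regimes can be explored numerically. The following is a minimal sketch, assuming the pdf has the form (1); the function names `gen_gamma_pdf` and `gen_gamma_mode` are illustrative and do not come from the report.

```python
import math

def gen_gamma_pdf(x, b, p, beta):
    """pdf (1): beta / (b * Gamma((1-p)/beta)) * xi**(-p) * exp(-xi**beta), xi = x/b."""
    if x <= 0.0:
        return 0.0
    xi = x / b
    # Work with logarithms so that large Gamma arguments do not overflow.
    log_norm = math.log(beta) - math.log(b) - math.lgamma((1.0 - p) / beta)
    return math.exp(log_norm - p * math.log(xi) - xi ** beta)

def gen_gamma_mode(b, p, beta):
    """Location b * (-p/beta)**(1/beta) of the maximum of a hump-shaped pdf (p < 0)."""
    if p >= 0.0:
        raise ValueError("a maximum at x > 0 exists only for p < 0")
    return b * (-p / beta) ** (1.0 / beta)

if __name__ == "__main__":
    b, p, beta = 1.0, -1.0, 2.0   # a Rayleigh-type member of the class
    x_max = gen_gamma_mode(b, p, beta)
    print(x_max, gen_gamma_pdf(x_max, b, p, beta))
```

Evaluating the density through `math.lgamma` rather than `math.gamma` is a design choice of this sketch; it keeps the normalizing constant finite when (1-p)/β is large.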

For particular values of the shape parameters, the class of distributions
characterized by (*) contains a number of special cases well-known in
statistics and statistical physics. The major ones are [2, 7, 10]:

Gauss (p = 0, β = 2),

Weibull (p = 1 - β < 1),

exponential (p = 1 - β = 0),

Rayleigh (p = 1 - β = -1),

Gamma (p < 1, β = 1),

chi-square (p = (2-ν)/2 < 1, β = 1),

Maxwell (p = -2, β = 2, x = v t_0, b = (2kT/m)^{1/2} t_0), and

Wien (p = -3, β = 1, x = 2πc ω_0^{-1} ω, b = 2πc ω_0^{-1} kT h^{-1}).
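
Some of these reductions are easy to confirm numerically. The sketch below assumes the pdf (1) and compares it with the familiar closed forms of the Weibull, exponential, and one-sided Gauss densities; all function names are illustrative.

```python
import math

def gen_gamma_pdf(x, b, p, beta):
    """pdf (1) for x > 0, with xi = x/b."""
    xi = x / b
    return beta / (b * math.gamma((1.0 - p) / beta)) * xi ** (-p) * math.exp(-xi ** beta)

def weibull_pdf(x, b, beta):
    """Standard two-parameter Weibull density with scale b and shape beta."""
    xi = x / b
    return beta / b * xi ** (beta - 1.0) * math.exp(-xi ** beta)

def half_gauss_pdf(x, b):
    """One-sided Gauss density 2 / (b * sqrt(pi)) * exp(-(x/b)**2)."""
    return 2.0 / (b * math.sqrt(math.pi)) * math.exp(-(x / b) ** 2)

if __name__ == "__main__":
    x = 0.8
    print(gen_gamma_pdf(x, 1.5, 1.0 - 1.7, 1.7), weibull_pdf(x, 1.5, 1.7))
    print(gen_gamma_pdf(x, 2.0, 0.0, 1.0), math.exp(-x / 2.0) / 2.0)
    print(gen_gamma_pdf(x, 1.0, 0.0, 2.0), half_gauss_pdf(x, 1.0))
```

Each printed pair should agree to rounding error, since setting p = 1 - β makes the Gamma factor in (1) equal to Γ(1) = 1.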

Apparently, application of the distribution class (*) has been severely
limited, although various attempts have been made [1, 2, 3, 4, and 11] for
the special cases of Gamma and Weibull to formalize and standardize the
parameter estimation process. In fact, the distribution class (*) has not been
used as extensively in everyday statistical practice as it should be. The
main reason for this state of affairs is most likely computational intensity
and possibly convergence problems arising in the numerical solution of the
associated maximum-likelihood equations. This report will not deal further
with questions related to the maximum-likelihood approach. This will be done
elsewhere in a separate publication.

From an application point of view, to revitalize interest in the
distribution class (*) means to provide a practically useful, efficient, and
computationally economical technique for the determination of the three
unknown parameters b, p, and β relative to given frequency data. Practical
usefulness implies the notion of a criterion being involved whose satisfaction
can be verified in the application of the technique. The parameter
determination technique that is being proposed here does involve such a
criterion. It is based on an applicability criterion, announced already in
[7], which is characteristic for the distribution class (*). This criterion,
which will be presented in Section III, recognizes the fact that the logarithm
of the cdf, log F(x), is asymptotically linear in log x as x approaches zero
from above. This typical property of the class (*) can be exploited to
establish one equation in the three unknowns b, p, and β, which encompasses
the cumulative frequency data. Two more equations in the three unknowns are
obtained from the first and second moments which can be numerically determined
from the relative frequency data. Since the scale parameter b can easily be
eliminated by means of the first moment, two equations are eventually left in
the unknowns p and β.
The equation resulting from the log cdf function is too complicated to
be used directly. Therefore, it will be replaced by a simpler approximating
function which will subsequently be used for a least squares fit of the given
log cdf points. The quality of this approximation will be discussed in
Appendixes B and C.

The solution of the two final equations for the two unknown parameters p
and β proceeds by iteration (Section V). Convergence of the iteration process
provides the ultimate practical justification for the application of the
distribution class (*) relative to given empirical data (Appendix D).

A number of examples are presented in Section VI. These are "synthetic"
examples in the sense that their parameter values are known in advance and
then reconstructed by means of the proposed parameter determination method. A
quality test is immediately available by means of comparison of the original
and the calculated parameter values. One empirical example has been included
for purposes of exposition and demonstration. No attempt will be made in this
report to do a goodness-of-fit test. This will be left to another publication
which will deal exclusively with empirical examples.

While work on this project was in progress and during its publication
phase, parallel efforts on maximum-likelihood density estimation for the
hyper-Gamma class have led to essential new results [14] which cover both the
three- and four-parameter cases. Although computer programming via the
maximum-likelihood approach is more complex than that required by the
technique presented in this report, maximum-likelihood density estimation may
be preferable in practice. Nevertheless, the method presented here leads
quickly to approximate parameter values which may be used as initial values in
maximum-likelihood estimations.
II. NOTATIONS AND FORMULAS


In statistical practice, empirical data are normally given in terms of
absolute frequencies, f_a, relative to a finite number m of class intervals,
[x_{ν-1}, x_ν) (ν = 1, ..., m). The intervals are assumed to be of equal length,
d = x_ν - x_{ν-1}, so that x_ν = νd, and x_0 = 0.

The (piecewise constant) absolute frequency function f_a(x), x ∈ [0, x_m),
is defined as f_a(x) = f_a(x_{ν-1}) for x ∈ [x_{ν-1}, x_ν). A relative frequency
function f_r(x), x ∈ [0, x_m), can now be defined as f_r(x) = N^{-1} f_a(x), N
being the total number of observations, i.e.,

$$
N = \sum_{\nu=1}^{m} f_a(x_{\nu-1}).
$$

With f_r(x) one associates the (empirical) pdf f(x) = d^{-1} f_r(x), x ∈ [0, x_m).
A major problem in statistical analysis arises in the attempt to construct a
continuous analogue of a given (piecewise constant) empirical pdf. The main
objective of the work presented in this report is a new approach to the
solution of this problem within the class of distributions (*).

The (empirical) cdf associated with given frequency data is defined as a
continuous and piecewise linear function F(x) with functional values at x = 0
and at the interval endpoints given by

$$
F(0) = 0, \qquad
F(x_\nu) = \sum_{\mu=1}^{\nu} f(x_{\mu-1})\,d = \sum_{\mu=1}^{\nu} f_r(x_{\mu-1}) \quad (\nu = 1, \ldots, m). \qquad (2)
$$

The set of m class intervals, [x_{ν-1}, x_ν), x_0 = 0, x_ν = νd (ν = 1, ..., m),
together with the cdf values, F(x_ν), as defined in (2), shall be called an
empirical data set.

The (theoretical) moments of the distribution class (*) are given by the
formula

$$
M_\nu = \int_0^\infty x^{\nu} f(x)\,dx
      = b^{\nu}\, \frac{\Gamma\left((\nu+1-p)\beta^{-1}\right)}{\Gamma\left((1-p)\beta^{-1}\right)}
      \quad (\nu = 0, 1, 2, \ldots), \qquad (3)
$$

f(x) given by (1), M_0 = 1, M_1 = μ being the mean value, and M_2 being the mean
square value.

Observe the important inequality

$$
0 < A = \frac{\mu^2}{M_2} < 1, \qquad (4)
$$

which follows from

$$
0 < \int_0^\infty (x-\mu)^2 f(x)\,dx = M_2 - \mu^2.
$$
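
Formula (3) and the ratio in (4) are straightforward to evaluate. The sketch below assumes the moment formula as given in (3); the helper names are illustrative only.

```python
import math

def theoretical_moment(nu, b, p, beta):
    """M_nu = b**nu * Gamma((nu+1-p)/beta) / Gamma((1-p)/beta), formula (3)."""
    log_ratio = math.lgamma((nu + 1.0 - p) / beta) - math.lgamma((1.0 - p) / beta)
    return b ** nu * math.exp(log_ratio)

def moment_ratio(b, p, beta):
    """Ratio M_1**2 / M_2, which by (4) lies strictly between 0 and 1."""
    m1 = theoretical_moment(1, b, p, beta)
    m2 = theoretical_moment(2, b, p, beta)
    return m1 * m1 / m2

if __name__ == "__main__":
    # Exponential member (b = 1, p = 0, beta = 1): M_1 = 1, M_2 = 2, ratio 0.5.
    print(theoretical_moment(1, 1.0, 0.0, 1.0),
          theoretical_moment(2, 1.0, 0.0, 1.0),
          moment_ratio(1.0, 0.0, 1.0))
```

Note that the ratio does not depend on the scale parameter b, which is why b can later be eliminated from the equations for the shape parameters.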


Replacement of f(x) in (3) by the empirical pdf yields the (empirical)
first and second moments,

$$
M_1 = d \sum_{\nu=1}^{m} f_r(x_{\nu-1})\,(\nu - 1/2), \qquad (5)
$$

$$
M_2 = d^2 \sum_{\nu=1}^{m} f_r(x_{\nu-1})\,(\nu(\nu-1) + 1/3). \qquad (6)
$$
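
The definitions of this section translate directly into a few lines of code. The sketch below assumes equal class widths d and uses the absolute frequencies of the empirical example listed in Table 1; the function name `empirical_summary` is illustrative.

```python
def empirical_summary(abs_freq, d=1.0):
    """Relative frequencies, cdf values (2), and empirical moments (5) and (6)."""
    n_obs = sum(abs_freq)
    rel = [f / n_obs for f in abs_freq]
    cdf, running = [], 0.0
    for fr in rel:
        running += fr                 # F(x_nu) is the running sum of f_r
        cdf.append(running)
    m1 = d * sum(fr * (nu - 0.5) for nu, fr in enumerate(rel, start=1))
    m2 = d * d * sum(fr * (nu * (nu - 1) + 1.0 / 3.0)
                     for nu, fr in enumerate(rel, start=1))
    return rel, cdf, m1, m2

# Absolute frequencies of the 14 classes in Table 1 (N = 119).
FABS = [2, 6, 14, 17, 21, 14, 15, 10, 7, 6, 2, 1, 3, 1]

if __name__ == "__main__":
    rel, cdf, m1, m2 = empirical_summary(FABS, d=1.0)
    print(round(m1, 4), round(m2, 4), round(m1 * m1 / m2, 4))
```

The weights (ν - 1/2) and (ν(ν-1) + 1/3) are the exact first and second moments of a uniform density on the νth class interval, in units of d and d^2, respectively.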




III.  THE  APPLICABILITY  CRITERION 


Now return to the cdf class, F(x), given in (*). By means of the definition
of the incomplete Gamma function γ(a, y) in terms of the degenerate
hypergeometric function Φ(·,·;·) [12; 9.236.4] the nontrivial part of F(x)
can be represented in the form

$$
F(x) = \frac{\xi^{1-p}\, \Phi\left((1-p)\beta^{-1},\, 1+(1-p)\beta^{-1};\, -\xi^{\beta}\right)}
            {\Gamma\left(1+(1-p)\beta^{-1}\right)}, \quad \xi = x b^{-1}.
$$

This allows a useful expression for the logarithm of F(x) to be obtained:

$$
\log F(x) = -\log \Gamma\left(1+(1-p)\beta^{-1}\right) + (1-p)\log x b^{-1}
          + \log \Phi\left((1-p)\beta^{-1},\, 1+(1-p)\beta^{-1};\, -(x b^{-1})^{\beta}\right). \qquad (7)
$$

The independent variable transformation x = M_1 y is carried out. The reason
for this transformation is that, for a given empirical data set, all interval
endpoints x_ν = νd with x_ν < M_1 will be transformed into points y_ν with
0 < y_ν < 1, so that the corresponding numbers u_ν = log y_ν = log x_ν M_1^{-1}
will be negative. (In some cases where there are only a few points x_ν < M_1,
it may be better to transform x into y by means of a factor κ, M_1 < κ ≤ x_m.
In any case, from a practical point of view, as will be seen shortly, it is
essential to have "sufficiently" many numbers u_ν = log y_ν = log x_ν κ^{-1}
with u_ν < 0.)

With log y = u and log F(x) = log F(M_1 y) = log F(M_1 e^u) = v(u), so that
log x b^{-1} = log y M_1 b^{-1} = u - log M_1^{-1} b, the functional relation

$$
v(u) = (1-p)u - \log \Gamma\left(1+(1-p)\beta^{-1}\right) - (1-p)\log M_1^{-1} b
     + \log \Phi\left((1-p)\beta^{-1},\, 1+(1-p)\beta^{-1};\, -(M_1 b^{-1} e^{u})^{\beta}\right) \qquad (8)
$$

is obtained from (7). The function Φ is represented as a power series in its
last argument with constant term equal to unity. Therefore, as x ↓ 0, i.e.,
as y ↓ 0, which means as u → -∞, log Φ → 0. (For the argument of Φ in (8)
the series is alternating and, hence, 0 < Φ < 1.) Consequently, the function
v(u) given in (8) is asymptotically linear in u as u → -∞. In other words,

$$
v(u) \sim v_a(u) = (1-p)u - \log \Gamma\left(1+(1-p)\beta^{-1}\right) - (1-p)\log M_1^{-1} b,
\quad u \to -\infty.
$$

This asymptotic linearity property may also be expressed by saying that, as
u → -∞, the graph of the function v(u) approaches the (straight line)
asymptote determined by the equation

$$
v_a(u) = (1-p)u - \log \Gamma\left(1+(1-p)\beta^{-1}\right) - (1-p)\log M_1^{-1} b.
$$
Here and in (8) the scale parameter b may be eliminated by means of the first
moment, using (3) for ν = 1,

$$
b = M_1\, \frac{\Gamma\left((1-p)\beta^{-1}\right)}{\Gamma\left((2-p)\beta^{-1}\right)},
$$

which leads to

$$
v_a(u) = (1-p)u - \log\left((1-p)\beta^{-1}\right) - (2-p)\log \Gamma\left((1-p)\beta^{-1}\right)
       + (1-p)\log \Gamma\left((2-p)\beta^{-1}\right). \qquad (9)
$$
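
The elimination step leading to (9) can be written out explicitly. In the sketch below, the abbreviations a = (1-p)β^{-1} and c = (2-p)β^{-1} are introduced only for brevity and are not part of the report's notation; the only identity used is Γ(1+a) = aΓ(a).

```latex
\begin{align*}
\log \Gamma(1+a) &= \log a + \log \Gamma(a), \qquad
\log M_1^{-1} b = \log \Gamma(a) - \log \Gamma(c), \\
v_a(u) &= (1-p)u - \log a - \log \Gamma(a)
          - (1-p)\left[\log \Gamma(a) - \log \Gamma(c)\right] \\
       &= (1-p)u - \log a - (2-p)\log \Gamma(a) + (1-p)\log \Gamma(c).
\end{align*}
```

The coefficient (2-p) arises from collecting the two log Γ(a) terms, 1 + (1-p) = 2-p.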

Obviously, the graph of the function v(u) has a second asymptote, namely the
line v = 0 for u → +∞. This one, however, is of no further interest.

Based on the asymptotic linearity property of the function v(u), one can
formulate the following applicability criterion which has been announced
already in [7]:

A distribution function F(x) of the class (*) may be considered as a
candidate for a data fit if the logarithmic plot of a given set of empirical
data, i.e., the plot of the points P_ν = (u_ν, v_ν), u_ν = log x_ν κ^{-1},
M_1 ≤ κ ≤ x_m, v_ν = log F(x_ν) (ν = 1, ..., m), indicates the existence of an
asymptote as u → -∞.

It is essential to observe that the initial shape parameter p of a
member of the distribution class (*) is uniquely determined by the direction
angle θ of the asymptote of the graph of the function v(u). According to (8)
and (9), 1-p = tan θ. This fact will be exploited in the parameter
determination method.


IV. DETERMINATION OF THE PARAMETERS


This section presents the general outline of the proposed parameter
determination method relative to the distribution class (*). The actual
computational procedure will be established in Section V.

Determination of the parameters b, p, and β relative to a given empirical
data set requires the solution of three simultaneous equations. For
notational convenience p is replaced by 1-σ. Since the scale parameter b can
be expressed uniquely in terms of the two shape parameters by means of the
first moment (3), it is actually necessary to have only two equations
involving the two shape parameters. One such equation can be obtained from the
second moment upon elimination of b. It is of the form

$$
h(\beta,\sigma) = \Gamma^2\left(\frac{1+\sigma}{\beta}\right)
                - A\,\Gamma\left(\frac{\sigma}{\beta}\right)\Gamma\left(\frac{2+\sigma}{\beta}\right) = 0, \qquad (10)
$$

in which, according to (4), 0 < A < 1. A second equation, g(β,σ) = 0, follows
from the function v(u) given in (8) if u = 0 and b is eliminated.
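
Equation (10) can be checked against members of the class whose moments are known in closed form. The sketch below assumes A = M_1^2/M_2 and the form of h given in (10); the function name `h` mirrors the text, everything else is illustrative.

```python
import math

def h(beta, sigma, A):
    """Left-hand side of (10): Gamma((1+s)/b)**2 - A * Gamma(s/b) * Gamma((2+s)/b)."""
    g1 = math.gamma((1.0 + sigma) / beta)
    g0 = math.gamma(sigma / beta)
    g2 = math.gamma((2.0 + sigma) / beta)
    return g1 * g1 - A * g0 * g2

if __name__ == "__main__":
    # Exponential case: sigma = 1 - p = 1, beta = 1, A = M_1**2 / M_2 = 0.5.
    print(h(1.0, 1.0, 0.5))
    # Rayleigh-type Weibull case: sigma = 2, beta = 2, A = Gamma(1.5)**2 = pi / 4.
    print(h(2.0, 2.0, math.pi / 4.0))
```

Both printed values should vanish to rounding error, which confirms that the true shape parameters of these two members satisfy (10).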

Unfortunately, the second equation is unpleasant from a computational
point of view. It is desirable, therefore, from a practical standpoint, to
replace it by some other equation which can more easily be handled.

To achieve this objective, an approximating function v*(u) is used for
the function v(u) with the fact in mind that the asymptote of the graph of
v(u) determines the initial shape parameter uniquely. For v*(u) the function

$$
v^*(u) = \sigma u + \rho\left(e^{\beta u} - 1\right) + v(0), \quad \sigma = 1-p, \qquad (11)
$$

is chosen.

There are several reasons for this choice of v*(u):

(1) The graph of v*(u) has the asymptote v*_a(u) = σu - ρ + v(0) as
u → -∞, its direction tangent σ = 1-p being the same as that of the asymptote
of the graph of the original function v(u) (9),

(2) The function v*(u) approximates the function v(u) well over the
interval (-∞, 0] (Appendix B). Of course, regardless of the value of ρ, v*(u)
will not approximate v(u) for large values of u, since v(u) → 0 as u → +∞
whereas v*(u) does not. This is no matter of concern, however. The intention
is to exploit the asymptotic linearity property of v(u) as u → -∞.

(3) v*(0) = v(0), and

(4) The function v*(u) is linear in its coefficients σ and ρ.

If v(u) can now be approximated by v*(u) in such a fashion that the
coefficient σ, say, becomes a well-defined function of β, σ = σ(β), then the
needed second equation, g*(β,σ) = σ - σ(β) = 0, to solve the problem results.
The easiest way to explain the procedure is to go along with an example.
Table 1 shows absolute frequencies (FABS) over m = 14 classes (K) with
intervals [x_{ν-1}, x_ν) = [ν-1, ν) of length d = 1. The total number of
observations is N = 119. The data for this example (Example Library
Classification: EMPEX #3) originated from Reference [13]. EMPEX #3 presents
the frequency distribution of 119 upper-tropospheric wind speeds measured over
Nashville, Tennessee, between mid-May and mid-September 1985. The reported
(scalar) wind speed values refer to the 300 hectopascal level which
corresponds approximately to a height of 9.6 km. The original reports [13] of
wind speeds in integral values of knots have been grouped here into classes of
5 knots. Therefore, the νth class interval [ν-1, ν) contains the observations
from 5ν-5 to 5ν-1 knots (ν = 1, ..., 14).


TABLE 1. Empirical Example #3 - Absolute Frequencies.


 K      XR   FABS

 1    1.00      2
 2    2.00      6
 3    3.00     14
 4    4.00     17
 5    5.00     21
 6    6.00     14
 7    7.00     15
 8    8.00     10
 9    9.00      7
10   10.00      6
11   11.00      2
12   12.00      1
13   13.00      3
14   14.00      1
The relative frequencies (FREL) are given in Table 2 together with the cdf
values F (CUMREL) at the right-hand interval endpoints (XR) calculated
according to (2). This table also shows the coordinates u_ν = log νM_1^{-1},
v_ν = log F(ν) of the log cdf points P_ν = (u_ν, v_ν) in the U- and V-columns.
The value of M_1 = 5.4496 has been determined from (5). (It corresponds to
26.248 knots.)

TABLE 2. Empirical Example #3 - Relative Frequencies.


 K      XR     FREL    CUMREL         U         V

 1    1.00    1.68%     1.68%   -1.6955   -4.0860
 2    2.00    5.04%     6.72%   -1.0024   -2.6997
 3    3.00   11.76%    18.49%   -0.5969   -1.6881
 4    4.00   14.29%    32.77%   -0.3092   -1.1156
 5    5.00   17.65%    50.42%   -0.0861   -0.6848
 6    6.00   11.76%    62.18%    0.0962   -0.4751
 7    7.00   12.61%    74.79%    0.2504   -0.2905
 8    8.00    8.40%    83.19%    0.3839   -0.1840
 9    9.00    5.88%    89.08%    0.5017   -0.1157
10   10.00    5.04%    94.12%    0.6070   -0.0606
11   11.00    1.68%    95.80%    0.7024   -0.0429
12   12.00    0.84%    96.64%    0.7894   -0.0342
13   13.00    2.52%    99.16%    0.8694   -0.0084
14   14.00    0.84%   100.00%    0.9435    0.0000
The plot of the points P_ν (with P_9, P_10, P_12, P_13 omitted for reasons
of clarity) is shown in Figure 1. Inspection of the plot leads to the
conclusion that the class (*) can be applied for a data fit.


Digressing briefly, a few remarks concerning cdf plots like the one
shown in Figure 1 are offered. It is strongly recommended that the plot
be prepared for a given empirical data set and inspected carefully for the
following reasons: (1) it provides the first opportunity to decide whether
or not the distribution class (*) should be applied for a data fit, and
(2) the plot provides the analyst with some basic information about the type
of distribution being dealt with beyond that which can be extracted from a
histogram. If an asymptote location can be estimated, its direction angle θ
provides immediately an estimate of the initial shape parameter p since
tan θ = σ = 1-p. Observe that 0 < p < 1 (J-shaped pdf) if 0 < θ < π/4,
p = 0 if θ = π/4 (half bell-shaped type pdf, purely exponential), and p < 0
(hump-shaped pdf) if π/4 < θ < π/2.
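
A rough numerical stand-in for reading the asymptote off the plot is the secant slope through the two leftmost log cdf points. This shortcut is an assumption of the sketch below, not a step of the report's method, and the function name is illustrative; because the graph of v(u) is concave, the secant can be expected to understate the slope of the true asymptote.

```python
import math

def crude_initial_shape(x1, F1, x2, F2):
    """Secant slope of log F against log x through two points; returns (slope, p)."""
    slope = (math.log(F2) - math.log(F1)) / (math.log(x2) - math.log(x1))
    return slope, 1.0 - slope          # tan(theta) = 1 - p

if __name__ == "__main__":
    # Two leftmost points of Table 2: F(1) = 2/119, F(2) = 8/119.
    slope, p = crude_initial_shape(1.0, 2.0 / 119.0, 2.0, 8.0 / 119.0)
    theta = math.atan(slope)
    print(slope, p, math.pi / 4.0 < theta < math.pi / 2.0)
```

The slope does not depend on the scaling factor used for x, since a change of scale only shifts all abscissas u by the same constant.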
If now, in the general case, the plot of points P_ν = (u_ν, v_ν) (ν = 1, ..., m),
u_ν = log x_ν κ^{-1}, v_ν = log F(x_ν), with enumeration done such that
u_1 < u_2 < ... < u_{κ-3} ≤ 0 < u_{κ-2} < ... < u_m, of a given empirical data
set indicates that the distribution class (*) is applicable, then there must
be numbers p = 1-σ and β (and b) such that the function v(u) "fits" the points
P_ν. If this is so, then if v*(u) is a good approximating function of v(u), the
same will be true for v*(u) if the parameters σ, ρ, and β have been properly
chosen.

To specify the coefficients σ and ρ of v*(u), perform a least squares
fit on the points P_ν = (u_ν, v_ν) with

$$
u_1 < u_2 < \cdots < u_{\kappa-3} \le 0 < u_{\kappa-2} < u_{\kappa-1} < u_{\kappa}, \qquad (12)
$$

disregarding all others with index greater than κ. In the above example,
κ = 8. The reasons for this choice of a subset of the points P_ν are that
(1) points with u_ν > 0 over which the quality of the fit may be poor are
eliminated and (2) a sufficient number of points are available to adequately
account for the typical concavity of the graph of the function v(u) (Fig. 1).
Experience shows that a minimum of five points P_ν with negative abscissas
u_ν is normally adequate. Should there be fewer than five such points under
scaling of x by means of M_1, one should use a scaling factor κ > M_1.

With β in (11) as a parameter, the least squares fit on the points
P_ν with abscissas (12) leads to a system of two linear equations for σ and ρ
which can easily be solved to give σ and ρ as functions of β, σ = σ(β),
ρ = ρ(β). Actually, only σ = σ(β) is needed for the parameter determination
procedure. The function ρ(β) is useful, however, to judge the quality of the
approximation of v(u) by v*(u) (Appendix B).

Of course, it is necessary in this process to determine the numerical
value of v(0) which appears in (11). But this number can easily be calculated
by means of Lagrange-Aitken interpolation over the consecutive points P_{κ-4},
P_{κ-3}, P_{κ-2}, P_{κ-1} with u_{κ-4} < u_{κ-3} ≤ 0 < u_{κ-2} < u_{κ-1}.
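
The point selection (12) and the interpolation for v(0) can be sketched as follows. The code evaluates the cubic interpolating polynomial in plain Lagrange form; Aitken's scheme, which the text names, is a different way of evaluating the same polynomial, so the value obtained should be the same. The function names are illustrative, and the abscissas are assumed to be sorted in increasing order.

```python
def select_kappa(u):
    """1-based index kappa such that u_{kappa-3} <= 0 < u_{kappa-2}, cf. (12)."""
    n_nonpos = sum(1 for val in u if val <= 0.0)   # this count equals kappa - 3
    kappa = n_nonpos + 3
    if n_nonpos < 2 or kappa > len(u):
        raise ValueError("not enough points on one side of u = 0")
    return kappa

def lagrange_at_zero(us, vs):
    """Value at u = 0 of the polynomial through the points (us[i], vs[i])."""
    total = 0.0
    for i, (ui, vi) in enumerate(zip(us, vs)):
        weight = 1.0
        for j, uj in enumerate(us):
            if j != i:
                weight *= (0.0 - uj) / (ui - uj)
        total += vi * weight
    return total

def v_at_zero(u, v):
    """Four-point interpolation over P_{kappa-4}, ..., P_{kappa-1}."""
    kappa = select_kappa(u)
    idx = range(kappa - 5, kappa - 1)              # 0-based list positions
    return lagrange_at_zero([u[i] for i in idx], [v[i] for i in idx])

# U and V columns of Table 2 (first eight points are enough here).
U = [-1.6955, -1.0024, -0.5969, -0.3092, -0.0861, 0.0962, 0.2504, 0.3839]
V = [-4.0860, -2.6997, -1.6881, -1.1156, -0.6848, -0.4751, -0.2905, -0.1840]

if __name__ == "__main__":
    print(select_kappa(U), v_at_zero(U, V))
```

For the Table 2 data the selection returns κ = 8, in agreement with the value quoted in the text, and the interpolated v(0) falls between the ordinates of the two points adjacent to u = 0.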
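V. THE ITERATION PROCESS

The least squares fit on the log cdf points P_ν = (u_ν, v_ν) with abscissas
satisfying the inequalities (12) by means of the function v*(u) given in (11)
leads to the error equations

$$
v^*(u_\nu) = \sigma u_\nu + \rho\left(e^{\beta u_\nu} - 1\right) + v(0) = v_\nu + \varepsilon_\nu
\quad (\nu = 1, \ldots, \kappa). \qquad (13)
$$

Minimization of the sum of squares of the errors ε_ν specifies the coefficients
σ and ρ as functions of the parameter β,

$$
\sigma(\beta) = D_1/D, \quad \rho(\beta) = D_2/D. \qquad (14)
$$

The determinants are defined by

$$
D(\beta) = A_{11}A_{22} - A_{12}^2, \quad
D_1(\beta) = BA_{22} - CA_{12}, \quad
D_2(\beta) = CA_{11} - BA_{12} \qquad (15)
$$

with

$$
A_{11} = \sum_{\nu=1}^{\kappa} u_\nu^2, \quad
A_{12}(\beta) = \sum_{\nu=1}^{\kappa} u_\nu a_\nu, \quad
A_{22}(\beta) = \sum_{\nu=1}^{\kappa} a_\nu^2,
$$

$$
B = \sum_{\nu=1}^{\kappa} u_\nu c_\nu, \quad
C(\beta) = \sum_{\nu=1}^{\kappa} a_\nu c_\nu, \quad
a_\nu = e^{\beta u_\nu} - 1, \quad
c_\nu = v_\nu - v(0). \qquad (16)
$$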
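
Formulas (14) through (16) amount to solving a 2-by-2 system of normal equations by Cramer's rule. A minimal sketch, with the illustrative function name `fit_sigma_rho`, is:

```python
import math

def fit_sigma_rho(u, v, v0, beta):
    """Least squares coefficients sigma(beta), rho(beta) according to (14)-(16)."""
    a = [math.exp(beta * ui) - 1.0 for ui in u]    # a_nu = exp(beta * u_nu) - 1
    c = [vi - v0 for vi in v]                      # c_nu = v_nu - v(0)
    A11 = sum(ui * ui for ui in u)
    A12 = sum(ui * ai for ui, ai in zip(u, a))
    A22 = sum(ai * ai for ai in a)
    B = sum(ui * ci for ui, ci in zip(u, c))
    C = sum(ai * ci for ai, ci in zip(a, c))
    D = A11 * A22 - A12 * A12
    return (B * A22 - C * A12) / D, (C * A11 - B * A12) / D

if __name__ == "__main__":
    # Points generated exactly from (11) with known coefficients are recovered.
    sigma, rho, beta, v0 = 1.5, -0.4, 1.2, -0.5
    u = [-2.0, -1.5, -1.0, -0.5, -0.1, 0.1, 0.3]
    v = [sigma * ui + rho * (math.exp(beta * ui) - 1.0) + v0 for ui in u]
    print(fit_sigma_rho(u, v, v0, beta))
```

Since β enters only through the numbers a_ν, the fit is linear for each fixed β, which is what makes σ(β) cheap to evaluate inside the iteration.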

In addition to the equation g*(β,σ) = σ - σ(β) = 0, use the equation
h(β,σ) = 0 given in (10). The coefficient A which appears in the function h
is to be determined by means of the formulas (5) and (6). Essential
properties of the equations g* = 0 and h = 0 are discussed in Appendixes A
and C.


The iteration process now proceeds as follows. Set β = 1 in (13) and
calculate the value σ_1 = σ(1) from (14). Then solve the equation h(β,σ_1) = 0.
As a matter of fact, use of the equation H(α,σ) = 0 obtained from h(β,σ) = 0
by the substitution β = α(1-α)^{-1} reduces the interval of the unknown from
(0, +∞) to (0, 1). The regula falsi method is used with the starting value
α = 0.5, and a search for the first pair of functional values of opposite
sign is initiated. Iteration is terminated when |α_ν - α_{ν-1}| falls below a
prescribed power of ten (the exponent is not legible in the available copy).
(The full Newton's method has also been used with no essential improvement in
accuracy but the added computational burden of having to evaluate the psi
function.)


The solution β_1 of h(β,σ_1) = 0 is then used to calculate the value
σ_2 = σ(β_1) from (14). Proceeding in this fashion, establish two sequences
{σ_ν} and {β_ν} which, provided the data set is well-conditioned, will converge
(Appendix D) to numbers σ_0 = 1 - p_0 and β_0, respectively. These numbers
p_0 and β_0 are the final values for the shape parameters p and β. The final
value b_0 for the scale parameter b is then obtained from the first moment,
b_0 = M_1 Γ((1-p_0)β_0^{-1}) / Γ((2-p_0)β_0^{-1}), and the parameter
determination process is complete.
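
The alternation between σ(β) and the root of h can be sketched as below. Two departures from the text are assumptions of this sketch: the root of h is found by plain bisection in α = β/(1+β) instead of regula falsi, and h is evaluated in logarithmic form, which has the same zero. All function names are illustrative; the demonstration data are the exact cdf values of the exponential distribution (b = 1, p = 0, β = 1) at x = 0.1, ..., 1.3.

```python
import math

def sigma_of_beta(u, v, v0, beta):
    """Least squares coefficient sigma(beta) from (14)-(16)."""
    a = [math.exp(beta * ui) - 1.0 for ui in u]
    c = [vi - v0 for vi in v]
    A11 = sum(ui * ui for ui in u)
    A12 = sum(ui * ai for ui, ai in zip(u, a))
    A22 = sum(ai * ai for ai in a)
    B = sum(ui * ci for ui, ci in zip(u, c))
    C = sum(ai * ci for ai, ci in zip(a, c))
    return (B * A22 - C * A12) / (A11 * A22 - A12 * A12)

def log_h_ratio(beta, sigma, A):
    """log of Gamma((1+s)/b)**2 / (A * Gamma(s/b) * Gamma((2+s)/b)); zero iff h = 0."""
    return (2.0 * math.lgamma((1.0 + sigma) / beta) - math.lgamma(sigma / beta)
            - math.lgamma((2.0 + sigma) / beta) - math.log(A))

def solve_beta(sigma, A, tol=1e-10):
    """Bisection in alpha = beta / (1 + beta); assumes one sign change on (0, 1)."""
    def f(alpha):
        return log_h_ratio(alpha / (1.0 - alpha), sigma, A)
    lo, hi = 1e-3, 1.0 - 1e-3
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("no sign change of h on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    alpha = 0.5 * (lo + hi)
    return alpha / (1.0 - alpha)

def determine_parameters(u, v, v0, M1, M2, tol=1e-2, max_iter=50):
    """Alternate sigma = sigma(beta) and h(beta, sigma) = 0, starting at beta = 1."""
    A = M1 * M1 / M2
    beta, sigma = 1.0, None
    for _ in range(max_iter):
        sigma_new = sigma_of_beta(u, v, v0, beta)
        beta_new = solve_beta(sigma_new, A)
        done = (sigma is not None and abs(sigma_new - sigma) < tol
                and abs(beta_new - beta) < tol)
        sigma, beta = sigma_new, beta_new
        if done:
            break
    else:
        raise RuntimeError("iteration did not converge")
    p = 1.0 - sigma
    b = M1 * math.exp(math.lgamma((1.0 - p) / beta) - math.lgamma((2.0 - p) / beta))
    return b, p, beta

def exponential_demo_points():
    """Log cdf points of F(x) = 1 - exp(-x) at x = 0.1, ..., 1.3 (M_1 = 1)."""
    xs = [k / 10.0 for k in range(1, 14)]
    return [math.log(x) for x in xs], [math.log(1.0 - math.exp(-x)) for x in xs]

if __name__ == "__main__":
    u, v = exponential_demo_points()
    # u = 0 is itself a data point here (x = 1), so v(0) needs no interpolation.
    print(determine_parameters(u, v, v[9], M1=1.0, M2=2.0))
```

Bisection was chosen for the sketch because it cannot leave the bracketing interval; it needs more function evaluations than regula falsi, which matters little at this problem size.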
In practice, of course, the iteration process will be terminated when a
desired accuracy has been reached. For the examples to be discussed in the
next section, the criterion |σ_ν - σ_{ν-1}| < 10^{-2}, |β_ν - β_{ν-1}| < 10^{-2}
was used and seems to be adequate. Thus, the first pair of values σ_ν and β_ν
which satisfy this criterion was taken as the final values.
VI. EXAMPLES


This section presents a number of examples for the parameter determination
method. To demonstrate its efficiency, several special cases of the
distribution class (*) were selected for which, in order to be able to
evaluate the results objectively, the parameter values were chosen to begin
with and then reconstructed. The resulting errors in these examples are
entirely due to errors arising from the approximation of the function v(u) by
the function v*(u). The log F(x) values have been calculated directly from the
exact cdf's which, since all of the examples are of Weibull type (i.e.,
p = 1 - β), are given by

$$
F(x) = 1 - \exp\left(-\xi^{\beta}\right), \quad \xi = x b^{-1}. \qquad (17)
$$

The moments M_1 and M_2 have been calculated from the formula (3) by means of
the given b, p, and β values. In empirical cases, additional errors will
arise from the use of the sample moments.

There are four examples, classified in our example library as SYNEX
(= synthetic example) #8, #9, #10, and #11. SYNEX #8 represents an exponential
distribution, SYNEX #11 a J-shaped distribution. The others are of hump-shaped
type. SYNEX #10 is presented in two different versions relative to the number
of classes.

Tables 3, 4, 5, 6, and 7 are essentially self-explanatory. The heading
includes the original parameter values (p = 1 - β in all cases). Column K
indicates the class interval number. In the second column, x_ν = XR gives the
right-hand class interval endpoint. The interval length d in each case can
immediately be extracted from this column. The cdf values F(x_ν) = CUMREL are
shown in column 4 as calculated from (17) for the given b, p, and β values up
to values of x_ν in such a way that the first three points P_ν = (u_ν, v_ν)
(ν = κ-2, κ-1, κ) in the fourth quadrant of the (u, v)-plane are included in
the set of points to be used for the least squares fit. Column 3 (which is
actually of no interest relative to the SYNEX's) shows the relative
frequencies f_r(x_{ν-1}) = FREL calculated from the cdf values. The coordinates
u_ν = U, v_ν = V of the points P_ν are given in columns 5 and 6, respectively.
The last column DV/DU = tan θ contains the coordinate difference ratios. It is
of some interest in these SYNEX's only.

The  moments  M2  *  Ml ,  M2  ■  M2,  and  the  numbers  A  ■  M2^/M2  and  v(0)  ■  VO 
are  given  In  the  center  block  of  each  table.  In  each  case,  the  numerical 
value  of  v(o)  has  been  calculated  by  means  of  four-point -Lagrange-Altken 
interpolation  as  explained  at  the  end  of  Section  IV. 

The last block in each table contains the numerical results for each
iteration step. The final values for the parameters appear in the lower
right-hand corner. Iteration in each example has been started with $\beta = 1$
and terminated at $|\beta_\nu - \beta_{\nu-1}| < 10^{-2}$.

It should be observed that the fact that the examples are of Weibull type
has nowhere been used in the iteration process, i.e., $p$ and $\beta$ (and $b$) have
been individually determined.
TABLE 3. SYNEX #8: Exponential Weibull Distribution (B=1, P=0, BETA=1)

 K     XR     FREL   CUMREL        U        V    DV/DU
 1   0.10    9.52%    9.52%   -2.303   -2.352    1.022
 2   0.20    8.61%   18.13%   -1.609   -1.708    0.930
 3   0.30    7.79%   25.92%   -1.204   -1.350    0.882
 4   0.40    7.05%   32.97%   -0.916   -1.110    0.836
 5   0.50    6.38%   39.35%   -0.693   -0.933    0.793
 6   0.60    5.77%   45.12%   -0.511   -0.796    0.751
 7   0.70    5.22%   50.34%   -0.357   -0.686    0.711
 8   0.80    4.73%   55.07%   -0.223   -0.597    0.672
 9   0.90    4.28%   59.34%   -0.105   -0.522    0.635
10   1.00    3.87%   63.21%    0.000   -0.459    0.599
11   1.10    3.50%   66.71%    0.095   -0.405    0.566
12   1.20    3.17%   69.88%    0.182   -0.358    0.533
13   1.30    2.87%   72.75%    0.262   -0.318    0.502

M1 = 1.0000    M2 = 2.0000    A = 0.5000    V0 = -0.4587

Iteration #1:  RHO = -0.4093   P0 = 0.0191   SIGMA = 0.9809   BETA0 = 1.0201   ALPHA0 = 0.5050   B0 = 1.0479
Iteration #2:  RHO = -0.3988   P0 = 0.0226   SIGMA = 0.9774   BETA0 = 1.0239   ALPHA0 = 0.5059   B0 = 1.0571

TABLE 4. SYNEX #9: Weibull Distribution (B=1, P=-1, BETA=2)

 K     XR     FREL   CUMREL        U        V    DV/DU
 1   0.10    1.00%    1.00%   -2.182   -4.610    2.113
 2   0.20    2.93%    3.92%   -1.489   -3.239    1.978
 3   0.30    4.69%    8.61%   -1.083   -2.453    1.939
 4   0.40    6.18%   14.79%   -0.796   -1.912    1.881
 5   0.50    7.33%   22.12%   -0.572   -1.509    1.805
 6   0.60    8.11%   30.23%   -0.390   -1.196    1.714
 7   0.70    8.50%   38.74%   -0.236   -0.948    1.608
 8   0.80    8.53%   47.27%   -0.102   -0.749    1.491
 9   0.90    8.24%   55.51%    0.015   -0.589    1.365
10   1.00    7.70%   63.21%    0.121   -0.459    1.233
11   1.10    6.97%   70.18%    0.216   -0.354    1.097

M1 = 0.8862    M2 = 1.0000    A = 0.7854    V0 = -0.6087

Iteration #1:  RHO = -0.7523   P0 = -1.1471   SIGMA = 2.1471   BETA0 = 1.8121   ALPHA0 = 0.6444   B0 = 0.8922
Iteration #2:  RHO = -0.3812   P0 = -1.0052   SIGMA = 2.0052   BETA0 = 1.9921   ALPHA0 = 0.6658   B0 = 0.9959
Iteration #3:  RHO = -0.3454   P0 = -0.9882   SIGMA = 1.9882   BETA0 = 2.0168   ALPHA0 = 0.6685   B0 = 1.0089
Iteration #4:  RHO = -0.3410   P0 = -0.9860   SIGMA = 1.9860   BETA0 = 2.0199   ALPHA0 = 0.6689   B0 = 1.0105
TABLE 5. SYNEX #10: Weibull Distribution (B=2, P=-2, BETA=3)

 K     XR     FREL   CUMREL        U        V    DV/DU
 1   0.10    0.01%    0.01%   -2.883   -8.987    3.118
 2   0.20    0.09%    0.10%   -2.189   -6.908    2.999
 3   0.30    0.24%    0.34%   -1.784   -5.693    2.997
 4   0.40    0.46%    0.80%   -1.496   -4.832    2.992
 5   0.50    0.75%    1.55%   -1.273   -4.167    2.983
 6   0.60    1.11%    2.66%   -1.091   -3.625    2.969
 7   0.70    1.53%    4.20%   -0.937   -3.171    2.949
 8   0.80    2.00%    6.20%   -0.803   -2.781    2.922
 9   0.90    2.51%    8.71%   -0.685   -2.441    2.886
10   1.00    3.04%   11.75%   -0.580   -2.141    2.842
11   1.10    3.58%   15.33%   -0.485   -1.876    2.788
12   1.20    4.10%   19.43%   -0.398   -1.639    2.724
13   1.30    4.59%   24.01%   -0.318   -1.427    2.649
14   1.40    5.02%   29.04%   -0.243   -1.237    2.562
15   1.50    5.38%   34.42%   -0.174   -1.067    2.465
16   1.60    5.65%   40.07%   -0.110   -0.915    2.356
17   1.70    5.82%   45.89%   -0.049   -0.779    2.236
18   1.80    5.87%   51.76%    0.008   -0.659    2.107
19   1.90    5.81%   57.57%    0.062   -0.552    1.968
20   2.00    5.64%   63.21%    0.113   -0.459    1.822

M1 = 1.7860    M2 = 3.6110    A = 0.8833    V0 = -0.6746

Iteration #1:  RHO = -0.7792   P0 = -2.1569   SIGMA = 3.1569   BETA0 = 2.7861   ALPHA0 = 0.7359   B0 = 1.8925
Iteration #2:  RHO = -0.3418   P0 = -2.0029   SIGMA = 3.0029   BETA0 = 2.9976   ALPHA0 = 0.7498   B0 = 1.9985
Iteration #3:  RHO = -0.3276   P0 = -1.9963   SIGMA = 2.9963   BETA0 = 3.0075   ALPHA0 = 0.7505   B0 = 2.0031
TABLE 6. SYNEX #10: Weibull Distribution (B=2, P=-2, BETA=3)

 K     XR     FREL   CUMREL        U        V    DV/DU
 1   0.30    0.34%    0.34%   -1.784   -5.693    3.191
 2   0.60    2.33%    2.66%   -1.091   -3.625    2.983
 3   0.90    6.05%    8.71%   -0.685   -2.441    2.922
 4   1.20   10.72%   19.43%   -0.398   -1.639    2.789
 5   1.50   14.99%   34.42%   -0.174   -1.067    2.563
 6   1.80   17.34%   51.76%    0.008   -0.659    2.238
 7   2.10   16.82%   68.58%    0.162   -0.377    1.825
 8   2.40   13.66%   82.24%    0.296   -0.196    1.360

M1 = 1.7860    M2 = 3.6110    A = 0.8833    V0 = -0.6746

Iteration #1:  RHO = -1.3093   P0 = -2.4482   SIGMA = 3.4482   BETA0 = 2.4435   ALPHA0 = 0.7096   B0 = 1.6902
Iteration #2:  RHO = -0.3908   P0 = -2.0325   SIGMA = 3.0325   BETA0 = 2.9538   ALPHA0 = 0.7471   B0 = 1.9778
Iteration #3:  RHO = -0.2972   P0 = -1.9730   SIGMA = 2.9730   BETA0 = 3.0435   ALPHA0 = 0.7527   B0 = 2.0195
Iteration #4:  RHO = -0.2840   P0 = -1.9643   SIGMA = 2.9643   BETA0 = 3.0574   ALPHA0 = 0.7535   B0 = 2.0257
Iteration #5:  RHO = -0.2821   P0 = -1.9630   SIGMA = 2.9630   BETA0 = 3.0595   ALPHA0 = 0.7537   B0 = 2.0266
TABLE 7. SYNEX #11: Weibull Distribution (B=5, P=0.5, BETA=0.5)

 K      XR     FREL   CUMREL        U        V    DV/DU
 1    1.00   36.06%   36.06%   -2.303   -1.020    0.443
 2    2.00   10.81%   46.87%   -1.609   -0.758    0.378
 3    3.00    7.04%   53.91%   -1.204   -0.618    0.345
 4    4.00    5.20%   59.12%   -0.916   -0.526    0.320
 5    5.00    4.10%   63.21%   -0.693   -0.459    0.300
 6    6.00    3.35%   66.56%   -0.511   -0.407    0.283
 7    7.00    2.81%   69.37%   -0.357   -0.366    0.268
 8    8.00    2.40%   71.77%   -0.223   -0.332    0.255
 9    9.00    2.08%   73.86%   -0.105   -0.303    0.243
10   10.00    1.83%   75.69%    0.000   -0.279    0.232
11   11.00    1.62%   77.31%    0.095   -0.257    0.222
12   12.00    1.45%   78.76%    0.182   -0.239    0.213
13   13.00    1.30%   80.06%    0.262   -0.222    0.205

M1 = 10.0000    M2 = 600.0000    A = 0.1667    V0 = -0.2785

Iteration #1:  RHO = -0.1775   P0 = 0.6110   SIGMA = 0.3890   BETA0 = 0.5897   ALPHA0 = 0.3709   B0 = 11.3206
Iteration #2:  RHO = -0.3785   P0 = 0.5568   SIGMA = 0.4432   BETA0 = 0.5409   ALPHA0 = 0.3510   B0 =  7.5874
Iteration #3:  RHO = -0.4325   P0 = 0.5449   SIGMA = 0.4551   BETA0 = 0.5315   ALPHA0 = 0.3470   B0 =  6.9509
TABLE 8. Empirical Example #3

 K      XR     FREL    CUMREL         U         V
 1    1.00    1.68%     1.68%   -1.6955   -4.0860
 2    2.00    5.04%     6.72%   -1.0024   -2.6997
 3    3.00   11.76%    18.49%   -0.5969   -1.6881
 4    4.00   14.29%    32.77%   -0.3092   -1.1156
 5    5.00   17.65%    50.42%   -0.0861   -0.6848
 6    6.00   11.76%    62.18%    0.0962   -0.4751
 7    7.00   12.61%    74.79%    0.2504   -0.2905
 8    8.00    8.40%    83.19%    0.3839   -0.1840
 9    9.00    5.88%    89.08%    0.5017   -0.1157
10   10.00    5.04%    94.12%    0.6070   -0.0606
11   11.00    1.68%    95.80%    0.7024   -0.0429
12   12.00    0.84%    96.64%    0.7894   -0.0342
13   13.00    2.52%    99.16%    0.8694   -0.0084
14   14.00    0.84%   100.00%    0.9435    0.0000

M1 = 5.4496    M2 = 37.1064    A = 0.8003    V0 = -0.5792

Iteration #1:  RHO = -1.1896   P0 = -1.6938   SIGMA = 2.6938   BETA0 = 1.5309   ALPHA0 = 0.6049   B0 = 4.0079
Iteration #2:  RHO = -0.6288   P0 = -1.4580   SIGMA = 2.4580   BETA0 = 1.7002   ALPHA0 = 0.6297   B0 = 4.7563
Iteration #3:  RHO = -0.5378   P0 = -1.4118   SIGMA = 2.4118   BETA0 = 1.7417   ALPHA0 = 0.6353   B0 = 4.9226
Iteration #4:  RHO = -0.5187   P0 = -1.4017   SIGMA = 2.4017   BETA0 = 1.7512   ALPHA0 = 0.6365   B0 = 4.9597
The accompanying Figures 2, 3, 4, 5, and 6 show the log cdf point plots.

In Figure 4 the points $P_1$ and $P_2$ are not shown.

The calculations have been performed on an IBM-PC compatible microcomputer
(without math co-processor) in compiled MS Basic. For empirical samples
of frequency distributions with approximately 40 classes, actual computing
time was less than 60 sec.

The evaluation of the differences between the obtained parameter values
$p_o$, $\beta_o$, $b_o$ and the original ones, $p$, $\beta$, $b$, shows that
$\max\{|p_o - p|, |\beta_o - \beta|, |b_o - b|\}$ is $< 6\cdot10^{-2}$ in SYNEX #8, $< 2\cdot10^{-2}$ in SYNEX #9,
$< 8\cdot10^{-3}$ in SYNEX #10a, $< 6\cdot10^{-2}$ in SYNEX #10b. In SYNEX #11,
$\max\{|p_o - p|, |\beta_o - \beta|\} < 5\cdot10^{-2}$, but $1.950 < b_o - b < 1.951$. The large error
in the scale parameter demonstrates the well-known sensitivity of this
parameter to small changes in the others for J-shaped distributions. The
culprit in this matter, of course, is the error in the initial shape parameter
$p$. Ultimately, this error results from the fact that the class interval
length in the example used is too big for this type of distribution.

Before closing this section, briefly return to the empirical example,
EMPEX #3, considered in Section IV. The first and second moments are (from
(5) and (6), respectively) $M_1$ = M1 = 5.4496 and $M_2$ = M2 = 37.1064 as shown in
the center block of Table 8, which also shows the numerical values of
$A = M_1^2/M_2$ and $v(0)$ = V0. Starting with $\beta = 1$, after the 4th iteration step,
the final parameter values $b_o$, $p_o$, and $\beta_o$ shown in the lower right-hand corner
of Table 8 are obtained. The least squares approximations include the points
$P_\nu$ for $\nu = 1,\ldots,8$. No goodness-of-fit test will be performed on the final
parameter values in this paper. This will be left to a separate publication
which will deal exclusively with empirical examples. However, it is worth
mentioning that a recently performed maximum-likelihood estimation of the
parameters of EMPEX #3 resulted in the final values $p = -1.466$, $\beta = 1.720$,
and $b = 4.795$.
The Equation $h(\beta,\sigma) = 0$

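Return to the equation $h(\beta,\sigma) = 0$, $0 < A < 1$, given in (10). The first
partial derivatives of $h(\beta,\sigma)$ are

\[ h_\sigma = \frac{h + A}{\beta}\left[2\Psi\!\left(\frac{1+\sigma}{\beta}\right) - \Psi\!\left(\frac{\sigma}{\beta}\right) - \Psi\!\left(\frac{2+\sigma}{\beta}\right)\right], \]

\[ h_\beta = -\frac{h + A}{\beta^2}\left[2(1+\sigma)\,\Psi\!\left(\frac{1+\sigma}{\beta}\right) - \sigma\,\Psi\!\left(\frac{\sigma}{\beta}\right) - (2+\sigma)\,\Psi\!\left(\frac{2+\sigma}{\beta}\right)\right], \]

$\Psi(y) = d \log\Gamma(y)/dy$ [12; 8.360] being the psi function. If
$h(\beta^*,\sigma^*) = 0$ for $\beta^* > 0$, $\sigma^* > 0$, then

\[ h_\sigma(\beta^*,\sigma^*) = \frac{A}{\beta^*}\left[2\Psi\!\left(\frac{1+\sigma^*}{\beta^*}\right) - \Psi\!\left(\frac{\sigma^*}{\beta^*}\right) - \Psi\!\left(\frac{2+\sigma^*}{\beta^*}\right)\right]. \]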


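Use of the series expansion for $\Psi(y)$ [12; 8.362.1],

\[ \Psi(y) = -\gamma - \sum_{\nu=0}^{\infty}\left(\frac{1}{y+\nu} - \frac{1}{1+\nu}\right), \]

$\gamma = -\Psi(1)$ being Euler's constant [12; 8.367.1], results in

\[ h_\sigma(\beta^*,\sigma^*) = \frac{A}{\beta^*}\sum_{\nu=0}^{\infty} r_\nu, \qquad r_\nu = \frac{2\beta^*}{(\sigma^* + \nu\beta^*)(1+\sigma^* + \nu\beta^*)(2+\sigma^* + \nu\beta^*)} > 0, \quad \beta^* > 0,\ \sigma^* > 0. \]

Consequently,

\[ h_\sigma(\beta^*,\sigma^*) > 0. \]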


In other words, considered as a function of $\sigma > 0$ for fixed $\beta > 0$, $h(\beta,\sigma)$ has
a positive derivative at each of its zeros $\sigma$. Since $h(\beta,\sigma)$ is continuous as a
function of $\sigma$, it follows that, for fixed $\beta > 0$, $h(\beta,\sigma) = 0$ can have at most
one root $\sigma > 0$ and that $h(\beta,\sigma)$ increases from negative to positive values
across the root.

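Again, if $\beta^*$ and $\sigma^*$ are positive numbers such that $h(\beta^*,\sigma^*) = 0$, then

\[ h_\beta(\beta^*,\sigma^*) = -\frac{A}{\beta^{*2}}\left[2(1+\sigma^*)\,\Psi\!\left(\frac{1+\sigma^*}{\beta^*}\right) - \sigma^*\,\Psi\!\left(\frac{\sigma^*}{\beta^*}\right) - (2+\sigma^*)\,\Psi\!\left(\frac{2+\sigma^*}{\beta^*}\right)\right]. \]

The following formula can be established:

\[ 2(1+\sigma^*)\,\Psi\!\left(\frac{1+\sigma^*}{\beta^*}\right) - \sigma^*\,\Psi\!\left(\frac{\sigma^*}{\beta^*}\right) - (2+\sigma^*)\,\Psi\!\left(\frac{2+\sigma^*}{\beta^*}\right) = \sum_{\nu=0}^{\infty} s_\nu, \]

\[ s_\nu = -\frac{2\beta^{*2}\,\nu}{(\sigma^* + \nu\beta^*)(1+\sigma^* + \nu\beta^*)(2+\sigma^* + \nu\beta^*)} < 0 \quad (\nu \ge 1), \quad \beta^* > 0,\ \sigma^* > 0, \]

so that

\[ h_\beta(\beta^*,\sigma^*) = -\frac{A}{\beta^{*2}}\sum_{\nu=0}^{\infty} s_\nu > 0. \]

Therefore, considered as a function of $\beta > 0$ for fixed $\sigma > 0$, $h(\beta,\sigma)$ has a
positive derivative at each of its zeros. Continuity again implies that, for
fixed $\sigma > 0$, $h(\beta,\sigma) = 0$ has at most one root, $\beta > 0$, and $h(\beta,\sigma)$ increases
from negative to positive values across the root.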


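To continue the investigation of the properties of the equation
$h(\beta,\sigma) = 0$ consider the ratio

\[ C(\beta,\sigma) = \frac{\Gamma^2\!\left(\frac{1+\sigma}{\beta}\right)}{\Gamma\!\left(\frac{\sigma}{\beta}\right)\Gamma\!\left(\frac{2+\sigma}{\beta}\right)}, \tag{A-1} \]

which is a continuous function of $\beta$ and $\sigma$ in the open domain $\beta > 0$, $\sigma > 0$.
Set $(1+\sigma)\beta^{-1} = \alpha$, $\beta^{-1} = \gamma$. Then (A-1) can be expressed as an infinite
product [12; 8.325.1],

\[ C(\beta,\sigma) = \frac{\Gamma(\alpha)\,\Gamma(\alpha)}{\Gamma(\alpha-\gamma)\,\Gamma(\alpha+\gamma)} = \prod_{\nu=0}^{\infty}\left[\left(1 - \frac{1}{(\alpha+\nu)\beta}\right)\left(1 + \frac{1}{(\alpha+\nu)\beta}\right)\right] = \prod_{\nu=0}^{\infty} t_\nu, \]

\[ 0 < t_\nu = \frac{(1+\sigma+\nu\beta)^2 - 1}{(1+\sigma+\nu\beta)^2} < 1, \quad \beta > 0,\ \sigma > 0. \tag{A-2} \]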


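The infinite product converges (absolutely) for every $\beta > 0$ and $\sigma > 0$ since
the series satisfies

\[ \sum_{\nu=0}^{\infty} (1+\sigma+\nu\beta)^{-2} < 1 + \beta^{-2}\sum_{\nu=1}^{\infty} \nu^{-2} < 1 + 2\beta^{-2}. \]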
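The agreement between the Gamma-function ratio and the infinite product can be checked numerically. The following sketch is illustrative only: it assumes the ratio $\Gamma^2((1+\sigma)/\beta)/[\Gamma(\sigma/\beta)\Gamma((2+\sigma)/\beta)]$ and the product factors $1 - (1+\sigma+\nu\beta)^{-2}$, and the function names are hypothetical.

```python
import math

def ratio_gamma(beta, sigma):
    """C(beta, sigma) evaluated through log-Gamma functions."""
    log_c = (2.0 * math.lgamma((1.0 + sigma) / beta)
             - math.lgamma(sigma / beta)
             - math.lgamma((2.0 + sigma) / beta))
    return math.exp(log_c)

def ratio_product(beta, sigma, terms):
    """Partial product of the factors 1 - 1/(1 + sigma + nu*beta)**2, nu = 0..terms-1."""
    prod = 1.0
    for nu in range(terms):
        prod *= 1.0 - 1.0 / (1.0 + sigma + nu * beta) ** 2
    return prod

if __name__ == "__main__":
    for beta, sigma in [(1.0, 1.0), (2.0, 2.0), (0.5, 0.5)]:
        print(beta, sigma, ratio_gamma(beta, sigma), ratio_product(beta, sigma, 20000))
```

Because the neglected tail of the product behaves like $1/(\beta^2 N)$ for $N$ retained factors, the partial product approaches the closed form slowly; the closed form is the practical choice for computation.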


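The inequality $0 < t_\nu < 1$ implies that $0 < C(\beta,\sigma) < 1$. Furthermore, because
of convergence of the product, the series

\[ \sum_{\nu=0}^{\infty} \log t_\nu = \sum_{\nu=0}^{\infty} \log\left(1 - \frac{1}{(1+\sigma+\nu\beta)^2}\right) \]

converges. Its $\mu$th partial sum is denoted by

\[ l_\mu = \sum_{\nu=0}^{\mu} \log\left(1 - \frac{1}{(1+\sigma+\nu\beta)^2}\right). \]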

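To investigate the behavior of the function $C$ defined in (A-1), first
consider $\sigma > 0$ fixed. Let $\beta = m^{-1}$, $m$ being a positive integer. Then

\[ -\infty < \log\left(1 - \frac{1}{(1+\sigma+\nu m^{-1})^2}\right) \le \log\left(1 - \frac{1}{(1+\sigma+1)^2}\right) < 0 \quad (\nu = 0,1,\ldots,m), \]

and, hence, for $\mu > m$,

\[ l_\mu < m \log\left(1 - \frac{1}{(2+\sigma)^2}\right) + \sum_{\nu=m+1}^{\mu} \log\left(1 - \frac{1}{(1+\sigma+\nu m^{-1})^2}\right) < m \log\left(1 - \frac{1}{(2+\sigma)^2}\right). \]

Therefore,

\[ 0 < \lim_{\mu\to+\infty} \prod_{\nu=0}^{\mu} t_\nu = \lim_{\mu\to+\infty} e^{l_\mu} < \exp\left(m \log\left(1 - \frac{1}{(2+\sigma)^2}\right)\right) = \left(1 - \frac{1}{(2+\sigma)^2}\right)^m. \]

The right-hand side can be made arbitrarily small if $m$ is sufficiently large.
In other words, for every fixed $\sigma > 0$, the infinite product $C(\beta,\sigma)$ will be
arbitrarily small if $\beta > 0$ is sufficiently small, i.e., for every fixed $\sigma > 0$,
it diverges to zero as $\beta \downarrow 0$.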


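To investigate the behavior of $C(\beta,\sigma)$ for $\sigma > 0$ fixed and $\beta$ large, use
the partial products of (A-2),

\[ P_\mu = \prod_{\nu=0}^{\mu} t_\nu = \frac{\prod_{\nu=0}^{\mu}\left[(1+\sigma+\nu\beta)^2 - 1\right]}{\prod_{\nu=0}^{\mu}(1+\sigma+\nu\beta)^2}. \]

Relative to $\beta$, the numerator is a polynomial of degree $2\mu$ with leading
coefficient $[(1+\sigma)^2 - 1](\mu!)^2$. The denominator $\prod_{\nu=0}^{\mu}(1+\sigma+\nu\beta)^2$ is a
polynomial in $\beta$ also of degree $2\mu$ with leading coefficient $(1+\sigma)^2(\mu!)^2$.
Therefore, for every fixed $\sigma > 0$ and for every $\mu$,

\[ P_\mu = \prod_{\nu=0}^{\mu} t_\nu \uparrow 1 - \frac{1}{(1+\sigma)^2} \quad \text{as } \beta \uparrow +\infty. \]

This means that, for every fixed $\sigma > 0$, the infinite product converges to
$1 - (1+\sigma)^{-2}$, $0 < 1 - (1+\sigma)^{-2} < 1$, as $\beta \uparrow +\infty$.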

From these results two preliminary conclusions are drawn:

(1) Since the function $C(\beta,\sigma) > 0$ in (A-1) can be made arbitrarily
small for every fixed $\sigma > 0$ if $\beta > 0$ is sufficiently small, the function
$h(\beta,\sigma)$ with $0 < A < 1$ will be negative for every fixed $\sigma > 0$ if $\beta > 0$ is
sufficiently small and

(2) If $1 - (1+\sigma)^{-2} < A$, i.e., if

\[ 0 < \sigma < \sigma_A, \tag{A-3} \]

$h(\beta,\sigma) < 0$ for every $\beta > 0$. On the other hand, $h(\beta,\sigma) > 0$ for every $\sigma > \sigma_A$
if $\beta$ is sufficiently large.


Consequently, since $h$ is a continuous function of $\beta$, observe that, for
every fixed $\sigma > \sigma_A$, $h(\beta,\sigma) = 0$ has at least one positive root. Earlier it
was observed that, for every fixed $\sigma > 0$, $h(\beta,\sigma) = 0$ has at most one root.
It follows that, for every fixed $\sigma > \sigma_A$, $h(\beta,\sigma) = 0$ has exactly one positive
root $\beta$.
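Since the root is unique and $h$ changes sign from negative to positive across it, plain bisection is sufficient to locate it. The sketch below rests on two assumptions that are not spelled out in this appendix: that $h(\beta,\sigma)$ equals the ratio (A-1) minus $A$, and that the bound $\sigma_A$ is the solution of $1 - (1+\sigma)^{-2} = A$, i.e. $\sigma_A = (1-A)^{-1/2} - 1$. The search bracket and function names are hypothetical.

```python
import math

def log_ratio(beta, sigma):
    """Logarithm of C(beta, sigma) from (A-1), computed with log-Gamma."""
    return (2.0 * math.lgamma((1.0 + sigma) / beta)
            - math.lgamma(sigma / beta)
            - math.lgamma((2.0 + sigma) / beta))

def sigma_lower_bound(a_value):
    """sigma_A, assumed here to solve 1 - (1 + sigma)**-2 = A."""
    return (1.0 - a_value) ** -0.5 - 1.0

def solve_beta(sigma, a_value, lo=1e-3, hi=1e3, steps=100):
    """Unique root beta of C(beta, sigma) = A for sigma > sigma_A (bisection in log beta)."""
    if sigma <= sigma_lower_bound(a_value):
        raise ValueError("no positive root: sigma must exceed sigma_A")
    target = math.log(a_value)
    for _ in range(steps):
        mid = math.sqrt(lo * hi)
        if log_ratio(mid, sigma) < target:   # h < 0: the root lies to the right
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

if __name__ == "__main__":
    print(solve_beta(1.0, 0.5))            # exponential case, sigma = 1
    print(solve_beta(2.0, math.pi / 4.0))  # Rayleigh-type case, sigma = 2
```

For $\sigma$ only slightly above $\sigma_A$ the root can exceed the upper bracket limit, which would then have to be enlarged.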

With the established existence of at least one point $P^* = (\beta^*, \sigma^*)$, $\beta^* > 0$,
$\sigma^* > \sigma_A$, such that $h(\beta^*,\sigma^*) = 0$, discussion of the equation $h(\beta,\sigma) = 0$
can be completed by means of the implicit function theorem. Its conditions
are satisfied in some neighborhood of $P^*$: $h(\beta^*,\sigma^*) = 0$, $(h_\beta)_{P^*} > 0$,
$(h_\sigma)_{P^*} > 0$, $h(\beta,\sigma)$, $h_\beta(\beta,\sigma)$, $h_\sigma(\beta,\sigma)$ being continuously differentiable in the
domain $0 < \beta < +\infty$, $\sigma_A < \sigma < +\infty$. Consequently, there exists a closed interval
$[\beta_1,\beta_2]$ such that $0 < \beta_1 < \beta^* < \beta_2$ and a one-valued continuous function
$\sigma = \bar\sigma(\beta)$ such that $h(\beta,\bar\sigma(\beta)) = 0$ for every $\beta \in [\beta_1,\beta_2]$. The implicitly
defined function $\bar\sigma(\beta)$ is even continuously differentiable in $(\beta_1,\beta_2)$. Its
derivative is given by

\[ \frac{d\bar\sigma(\beta)}{d\beta} = -\frac{h_\beta(\beta,\bar\sigma(\beta))}{h_\sigma(\beta,\bar\sigma(\beta))} < 0, \quad \beta \in (\beta_1,\beta_2), \]

i.e., $\bar\sigma(\beta)$ is a monotonically decreasing function of $\beta$ for $\beta_1 \le \beta \le \beta_2$.

The domain of existence of the implicit function $\bar\sigma(\beta)$ can now be extended
to all of $0 < \beta < +\infty$ by the following arguments.

Suppose $\bar\sigma(\beta)$ could not be continued to the right of some point $\hat\beta > 0$.
Then there would exist a point $\hat P$ on the line $\beta = \hat\beta$ with coordinates $\hat\beta$ and $\hat\sigma$,
$\sigma_A < \hat\sigma < +\infty$, such that every neighborhood of $\hat P$ would contain infinitely many
points $P$ of the graph of the function $\bar\sigma(\beta)$ with abscissas $\beta < \hat\beta$. At each of
these points $h(\beta,\sigma) = 0$. Then $\hat P$ would be a limit point of such points $P$.
Because of continuity of $h(\beta,\sigma)$ this would imply $h(\hat\beta,\hat\sigma) = 0$. But then the
function $\bar\sigma(\beta)$ could be extended to the right of $\hat\beta$ by the original arguments.
Analogous considerations apply for continuation to the left.

Since $h(\beta,\sigma) = 0$ has exactly one root $\beta > 0$ for every $\sigma > \sigma_A$, the range
of $\bar\sigma(\beta)$ is the interval $(\sigma_A, +\infty)$ and, as a consequence of monotonicity,

\[ \bar\sigma(\beta) \uparrow +\infty \ \text{as } \beta \downarrow 0, \qquad \bar\sigma(\beta) \downarrow \sigma_A \ \text{as } \beta \uparrow +\infty. \]




APPENDIX  B 


Approximation  of  v(u)  by  v*(u) 

Return to the function $v(u)$ defined in (8), replacing $M_1 b^{-1}$ by
$\Gamma((2-p)\beta^{-1})/\Gamma((1-p)\beta^{-1})$, leaving, however, $M_1 b^{-1}$ in the argument of the
$\Phi$-function unchanged for notational convenience. Then


To approximate $v(u)$ the function $v^*(u)$ given in (11) was used with $v(0)$
numerically to be determined by four-point Lagrange-Aitken interpolation from
given points of the log cdf plot.

Subtraction of $v^*(u)$ from (B-1) results in

\[ v(u) - v^*(u) = \rho + \log\frac{\Phi(u)}{\Phi(0)} - \rho e^{\beta u}, \tag{B-2} \]

where, for notational convenience, $\Phi(u)$ stands for the function $\Phi$ as it
appears in (B-1), $\Phi(0)$ for its value at $u = 0$. Since $|v(u) - v^*(u)|$ must be
small over a suitable $u$-interval and since $|v(u) - v^*(u)|$ must go to 0 as
$u \to -\infty$, from (B-2) it can be seen that the two constants must satisfy the
equation

\[ \rho = \log\Phi(0). \tag{B-3} \]

Now set $(1-p)\beta^{-1} = a$, $(M_1 b^{-1})^{\beta} = c$, and expand $\Phi(u)$ into its power series
[12; 9.210.1],

\[ \Phi(u) = \Phi(a, 1+a; -ce^{\beta u}) = 1 - \frac{a}{1+a}\frac{1}{1!}\,ce^{\beta u} + \frac{a}{2+a}\frac{1}{2!}\,c^2 e^{2\beta u} - \frac{a}{3+a}\frac{1}{3!}\,c^3 e^{3\beta u} + - \ldots . \]


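If each of the exponential functions is expanded into its power series, the
series for $\Phi$ can be rearranged as follows:

\[ \Phi(u) = 1 - \frac{a}{1+a}\frac{1}{1!}\,c + \frac{a}{2+a}\frac{1}{2!}\,c^2 - \frac{a}{3+a}\frac{1}{3!}\,c^3 + - \ldots \]
\[ \qquad - \frac{a}{1+a}\frac{1}{1!}\,c\left[\frac{1}{1!}(\beta u) + \frac{1}{2!}(\beta u)^2 + \frac{1}{3!}(\beta u)^3 + \ldots\right] \]
\[ \qquad + \frac{a}{2+a}\frac{1}{2!}\,c^2\left[\frac{1}{1!}(2\beta u) + \frac{1}{2!}(2\beta u)^2 + \frac{1}{3!}(2\beta u)^3 + \ldots\right] \]
\[ \qquad - \frac{a}{3+a}\frac{1}{3!}\,c^3\left[\frac{1}{1!}(3\beta u) + \frac{1}{2!}(3\beta u)^2 + \frac{1}{3!}(3\beta u)^3 + \ldots\right] + - \ldots . \]

The series in the first row is equal to $\Phi(0)$. The series in brackets in the
$\nu$th following row is equal to $e^{\nu\beta u} - 1$. Therefore,

\[ \Phi(u) = \Phi(0) - \sum_{\nu=1}^{\infty} (-1)^{\nu+1}\,\frac{a}{\nu+a}\,\frac{1}{\nu!}\,c^{\nu}\left(e^{\nu\beta u} - 1\right). \]

Denote the infinite series by $A(u)$. Then

\[ \Phi(u) = \Phi(0)\left[1 - A(u)\Phi^{-1}(0)\right] \tag{B-4} \]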
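The rearrangement leading to (B-4) can be verified numerically. This is a small sketch under the assumption that $\Phi(a, 1+a; -z) = \sum_n \frac{a}{a+n}\frac{(-z)^n}{n!}$, which for $a = 1$ has the closed form $(1 - e^{-z})/z$; the truncation length and the sample arguments are arbitrary choices for illustration.

```python
import math

def phi(a, z, terms=80):
    """Truncated series for Phi(a, 1+a; -z) = sum_n a/(a+n) * (-z)**n / n!."""
    total, term = 0.0, 1.0          # term holds (-z)**n / n!
    for n in range(terms):
        total += a / (a + n) * term
        term *= -z / (n + 1)
    return total

def a_series(a, c, beta, u, terms=80):
    """A(u) = sum_{nu>=1} (-1)**(nu+1) * a/(nu+a) * c**nu/nu! * (exp(nu*beta*u) - 1)."""
    total, power = 0.0, 1.0         # power holds c**nu / nu!
    for nu in range(1, terms):
        power *= c / nu
        total += ((-1.0) ** (nu + 1) * a / (nu + a) * power
                  * (math.exp(nu * beta * u) - 1.0))
    return total

if __name__ == "__main__":
    a, c, beta, u = 1.5, 0.8, 2.0, -0.7
    left = phi(a, c * math.exp(beta * u))          # Phi(u)
    right = phi(a, c) - a_series(a, c, beta, u)    # Phi(0) - A(u)
    print(left, right)
```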

and consequently,

\[ \log\Phi(u) = \log\Phi(0) + \log\left[1 - A(u)\Phi^{-1}(0)\right]. \]

Return with this expression to (B-2) and obtain

\[ |v(u) - v^*(u)| = \left|\rho + \log\left[1 - A(u)\Phi^{-1}(0)\right] - \rho e^{\beta u}\right|. \tag{B-5} \]

The identity (B-4) shows that $1 - A(0)\Phi^{-1}(0) = 1$. Furthermore, $0 < \Phi(0) < \Phi(u) < 1$
for $u \in (-\infty,0)$, and $\Phi(u) \uparrow 1$ as $u \downarrow -\infty$. Therefore,
$1 - A(u)\Phi^{-1}(0) \uparrow \Phi^{-1}(0)$ as $u \downarrow -\infty$. Consequently,

\[ 1 \le 1 - A(u)\Phi^{-1}(0) < \Phi^{-1}(0), \quad u \in (-\infty,0], \]

which implies

\[ 0 \le \log\left[1 - A(u)\Phi^{-1}(0)\right] < -\log\Phi(0), \quad u \in (-\infty,0]. \]

From (B-5) the following estimate now results:

\[ |v(u) - v^*(u)| < \left|\rho - \log\Phi(0) - \rho e^{\beta u}\right|, \quad u \le 0. \]

If (B-3) holds, this reduces to

\[ |v(u) - v^*(u)| < |\rho|, \quad u \le 0, \]

and $v(0) - v^*(0) = 0$. Since $|v(u) - v^*(u)| \to 0$ as $u \to -\infty$, the maximum error
in the approximation occurs at some $u_0 < 0$. Therefore

\[ |v(u) - v^*(u)| < |\rho| \quad \text{uniformly in } u \le 0. \]

The error over the interval $[0,u_k]$ is immaterial since the objective is
to approximate $v(u)$ as $u \to -\infty$.



The Coefficient $\sigma$ as a Function of $\beta$


Next the properties of the function $\sigma(\beta)$ defined in (14) together with
(15) and (16) are investigated.

First of all, establish the fact that $\sigma(\beta) > 0$ for $\beta \in (0,+\infty)$. For two
points $P_\nu = (u_\nu,v_\nu)$ ($\nu = 1, 2$) with $u_1 < u_2$ and the point $V_0 = (0,v(0))$,

\[ (u_1^2 + u_2^2)(a_1^2 + a_2^2) - (u_1a_1 + u_2a_2)^2 = (u_1a_2 - u_2a_1)^2 \]

and, by induction,

\[ D = A_{11}A_{22} - A_{12}^2 = \sum_{1\le\mu<\nu\le k} (u_\mu a_\nu - u_\nu a_\mu)^2 \]

for any number $k$ of points $P_\nu = (u_\nu, v_\nu)$ with the abscissas ordered as above.

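Now look at the terms

\[ u_\mu a_\nu - u_\nu a_\mu = u_\mu(e^{\beta u_\nu} - 1) - u_\nu(e^{\beta u_\mu} - 1), \quad 1 \le \mu < \nu \le k,\ \beta > 0. \tag{C-1} \]

Set $\beta u_\mu = x$, $\beta u_\nu = y$, $x < y$, $x \ne 0$, $y \ne 0$. Then (C-1) changes into
$x(e^y - 1) - y(e^x - 1)$ and $y = \alpha x$ leads to the function

\[ f(x) = x g(x), \quad g(x) = e^{\alpha x} - 1 - \alpha(e^x - 1), \quad g(0) = 0, \tag{C-2} \]

\[ g'(x) = \alpha e^x \left(e^{(\alpha-1)x} - 1\right). \]

Distinguish the three possible cases.

1. $0 < x < y = \alpha x$, $1 < \alpha < +\infty$, so that $g'(x) > 0$, $x > 0$. Since $g(0) = 0$,
$g(x) > 0$, $x > 0$ and, hence, $f(x) > 0$, $x > 0$. Therefore $u_\mu a_\nu - u_\nu a_\mu > 0$,
$0 < u_\mu < u_\nu$.

2. $x < 0 < y = \alpha x$, $-\infty < \alpha < 0$, $g'(x) < 0$. This and $g(0) = 0$ imply $g(x) > 0$,
$f(x) < 0$, $x < 0$, and hence, $u_\mu a_\nu - u_\nu a_\mu < 0$, $u_\mu < 0 < u_\nu$.

3. $x < y = \alpha x < 0$, $0 < \alpha < 1$, $g'(x) > 0$. $g(0) = 0$ leads to $g(x) < 0$,
$f(x) > 0$, $u_\mu a_\nu - u_\nu a_\mu > 0$, $u_\mu < u_\nu < 0$.

Therefore, for (C-1)

\[ u_\mu a_\nu - u_\nu a_\mu \begin{cases} > 0, & u_\mu u_\nu > 0, \\ < 0, & u_\mu u_\nu < 0. \end{cases} \tag{C-3} \]

Consequently, $D(\beta) > 0$, $\beta \in (0, +\infty)$.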


Next, look at $D_1(\beta)$. For two points,

\[ D_1 = (u_1c_1 + u_2c_2)(a_1^2 + a_2^2) - (a_1c_1 + a_2c_2)(u_1a_1 + u_2a_2) = (u_1a_2 - u_2a_1)(c_1a_2 - c_2a_1) \]

and, by induction,

\[ D_1 = BA_{22} - CA_{12} = \sum_{1\le\mu<\nu\le k} (u_\mu a_\nu - u_\nu a_\mu)(c_\mu a_\nu - c_\nu a_\mu). \]

Investigate the terms

\[ c_\mu a_\nu - c_\nu a_\mu = [v_\mu - v(0)](e^{\beta u_\nu} - 1) - [v_\nu - v(0)](e^{\beta u_\mu} - 1), \quad 1 \le \mu < \nu \le k,\ \beta > 0. \]

Division of (C-3) by $u_\mu u_\nu \ne 0$ results in

\[ \frac{a_\nu}{u_\nu} - \frac{a_\mu}{u_\mu} > 0 \quad \text{in any case.} \tag{C-4} \]

Let $r_\mu = c_\mu u_\mu^{-1}$, $r_\nu = c_\nu u_\nu^{-1}$, and assume $0 < r_\nu < r_\mu$. Since $r_\nu$ may be
interpreted as $\tan\theta_\nu$, $\theta_\nu$ being the angle between the horizontal positively
oriented line through $P_\nu$ and the line through $P_\nu$ and $V_0 = (0,v(0))$, the last
assumption implies concavity of the location of the three points $P_\mu$, $P_\nu$, and
$V_0$. Then, since $a_\nu u_\nu^{-1} > 0$, (C-4) implies

\[ \frac{a_\nu}{u_\nu}\,\frac{r_\mu}{r_\nu} - \frac{a_\mu}{u_\mu} > 0 \quad \text{in any case.} \]

Multiplication by $u_\mu u_\nu r_\nu$ yields

\[ r_\mu u_\mu a_\nu - r_\nu u_\nu a_\mu = c_\mu a_\nu - c_\nu a_\mu \begin{cases} > 0, & u_\mu u_\nu > 0, \\ < 0, & u_\mu u_\nu < 0. \end{cases} \tag{C-5} \]

On the basis of this inequality and (C-3), $D_1(\beta) > 0$, $\beta \in (0, +\infty)$, provided
the points $P_\nu$ are concavely located. Consequently, under this concavity
condition, the coefficient $\sigma(\beta)$ of the approximating function $v^*(u)$ is a
positive function of $\beta > 0$.
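The least squares coefficient and the two pair-sum identities can be illustrated with the tabulated SYNEX #9 data. The sketch assumes the fit model $c_\nu \approx \sigma u_\nu + \rho a_\nu$ with $a_\nu = e^{\beta u_\nu} - 1$ and $c_\nu = v_\nu - v(0)$, solved through the normal equations; the data are the three-decimal values of Table 4, so the results reproduce the table only approximately.

```python
import math

# (U, V) columns of Table 4 (SYNEX #9) and the tabulated v(0)
U = [-2.182, -1.489, -1.083, -0.796, -0.572, -0.390,
     -0.236, -0.102, 0.015, 0.121, 0.216]
V = [-4.610, -3.239, -2.453, -1.912, -1.509, -1.196,
     -0.948, -0.749, -0.589, -0.459, -0.354]
V0 = -0.6087

def fit_sigma_rho(beta):
    """Least squares fit of c = sigma*u + rho*a; returns (sigma, rho, D, D1)."""
    a = [math.exp(beta * u) - 1.0 for u in U]
    c = [v - V0 for v in V]
    a11 = sum(u * u for u in U)
    a12 = sum(u * x for u, x in zip(U, a))
    a22 = sum(x * x for x in a)
    b = sum(u * z for u, z in zip(U, c))
    cc = sum(x * z for x, z in zip(a, c))
    d = a11 * a22 - a12 * a12
    d1 = b * a22 - cc * a12
    return d1 / d, (a11 * cc - a12 * b) / d, d, d1

def pair_sums(beta):
    """D and D1 written as sums over all index pairs mu < nu."""
    a = [math.exp(beta * u) - 1.0 for u in U]
    c = [v - V0 for v in V]
    d = d1 = 0.0
    for m in range(len(U)):
        for n in range(m + 1, len(U)):
            r = U[m] * a[n] - U[n] * a[m]
            s = c[m] * a[n] - c[n] * a[m]
            d += r * r
            d1 += r * s
    return d, d1

if __name__ == "__main__":
    print(fit_sigma_rho(1.0))
    print(pair_sums(1.0))
```

Evaluating `fit_sigma_rho` at a second, larger value of $\beta$ gives a quick check that the fitted coefficient decreases with $\beta$ for these data.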

The following remark is essential at this stage. In practical situations,
all of the coordinates of the points $P_\nu$ ($\nu = 1,\ldots,k$) of a given empirical data
set may not satisfy the inequality (C-5). Indeed, this is frequently the case.
However, violation of (C-5) will occur only for points $P_\nu$ with $\nu$ equal or
close to 1, since the smoothing effect of the cumulative frequencies eliminates
this occurrence for large values of $\nu$. In other words, if the points
$P_\nu$ are sufficiently smoothly located, $D_1$ and, consequently, $\sigma$, will still be
positive. If, however, in a practical situation, $D_1$ should turn out to be
always negative or zero, then this is a clear indication that the considered
class of distributions cannot be used for a data fit.

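Turn now to the derivative of $\sigma(\beta)$ with respect to the parameter $\beta$. It
is given by

\[ \sigma' = D^{-2}\left\{D\left[BA'_{22} - C'A_{12} - CA'_{12}\right] - D_1\left[A_{11}A'_{22} - 2A_{12}A'_{12}\right]\right\}, \tag{C-6} \]

\[ A'_{12} = \sum_{\nu=1}^{k} u_\nu^2 b_\nu, \quad A'_{22} = 2\sum_{\nu=1}^{k} u_\nu a_\nu b_\nu, \quad C' = \sum_{\nu=1}^{k} u_\nu b_\nu c_\nu, \quad b_\nu = e^{\beta u_\nu}. \]

Starting from $k = 2$ one can show by induction that

\[ BA'_{22} - C'A_{12} - CA'_{12} = \sum_{1\le\mu<\nu\le k}\left[(u_\mu a_\nu - u_\nu a_\mu)(u_\nu b_\nu c_\mu - u_\mu b_\mu c_\nu) + u_\mu u_\nu(c_\mu a_\nu - c_\nu a_\mu)(b_\nu - b_\mu)\right] \]
\[ \qquad = \sum_{1\le\mu<\nu\le k}\left[(c_\mu u_\nu - c_\nu u_\mu)(u_\mu a_\nu b_\mu - u_\nu a_\mu b_\nu) + 2u_\mu u_\nu(c_\mu a_\nu - c_\nu a_\mu)(b_\nu - b_\mu)\right] \tag{C-7} \]

after addition and subtraction of $u_\mu u_\nu(c_\mu a_\nu - c_\nu a_\mu)(b_\nu - b_\mu)$, and

\[ A_{11}A'_{22} - 2A_{12}A'_{12} = 2\sum_{1\le\mu<\nu\le k} u_\mu u_\nu(u_\mu a_\nu - u_\nu a_\mu)(b_\nu - b_\mu). \tag{C-8} \]

The derivative of $\sigma(\beta)$ can now be written as

\[ \sigma' = D^{-2}\Big\{\Big[\sum(u_\mu a_\nu - u_\nu a_\mu)^2\Big]\Big[\sum(c_\mu u_\nu - c_\nu u_\mu)(u_\mu a_\nu b_\mu - u_\nu a_\mu b_\nu)\Big] \]
\[ \qquad + 2\Big[\sum(u_\mu a_\nu - u_\nu a_\mu)^2\Big]\Big[\sum u_\mu u_\nu(c_\mu a_\nu - c_\nu a_\mu)(b_\nu - b_\mu)\Big] \]
\[ \qquad - 2\Big[\sum(u_\mu a_\nu - u_\nu a_\mu)(c_\mu a_\nu - c_\nu a_\mu)\Big]\Big[\sum u_\mu u_\nu(u_\mu a_\nu - u_\nu a_\mu)(b_\nu - b_\mu)\Big]\Big\}, \tag{C-9} \]

summations to be performed over $1 \le \mu < \nu \le k$. For simplicity, set

\[ R_{\mu\nu} = u_\mu a_\nu - u_\nu a_\mu, \quad S_{\mu\nu} = c_\mu a_\nu - c_\nu a_\mu. \]

Then, using different subscript pairs for clearer distinction between the
individual factors, write the second and third lines in this expression for
$\sigma'$ as follows:

\[ 2\Big[\sum R_{\mu\nu}^2\Big]\Big[\sum u_\kappa u_\lambda S_{\kappa\lambda}(b_\lambda - b_\kappa)\Big] - 2\Big[\sum R_{\mu\nu}S_{\mu\nu}\Big]\Big[\sum u_\kappa u_\lambda R_{\kappa\lambda}(b_\lambda - b_\kappa)\Big]. \]

Those pairs of terms cancel for which the subscript pairs $(\mu,\nu)$ and $(\kappa,\lambda)$ are
equal. The remaining terms are pairwise of the form

\[ 2R_{\mu\nu}^2\,u_\kappa u_\lambda S_{\kappa\lambda}(b_\lambda - b_\kappa) - 2R_{\mu\nu}S_{\mu\nu}\,u_\kappa u_\lambda R_{\kappa\lambda}(b_\lambda - b_\kappa), \quad (\mu,\nu) \ne (\kappa,\lambda). \tag{C-10} \]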

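The error equations (13) of the least squares approximation can be
written as

\[ \sigma u_\nu + \rho a_\nu - c_\nu = \varepsilon_\nu \quad (\nu = 1,\ldots,k). \]

Division by $a_\nu$ ($a_\nu \ne 0$ since $u_\nu \ne 0$) results in

\[ \sigma\frac{u_\nu}{a_\nu} + \rho - \frac{c_\nu}{a_\nu} = \frac{\varepsilon_\nu}{a_\nu} \]

and consequently,

\[ \sigma\left(\frac{u_\mu}{a_\mu} - \frac{u_\nu}{a_\nu}\right) - \left(\frac{c_\mu}{a_\mu} - \frac{c_\nu}{a_\nu}\right) = \frac{\varepsilon_\mu}{a_\mu} - \frac{\varepsilon_\nu}{a_\nu}. \]

Multiplication by $a_\mu a_\nu$ leads to

\[ \sigma R_{\mu\nu} - S_{\mu\nu} = E_{\mu\nu} \quad (1 \le \mu < \nu \le k) \]

with $\varepsilon_\mu a_\nu - \varepsilon_\nu a_\mu = E_{\mu\nu}$. Then (C-10) changes into

\[ 2R_{\mu\nu}^2\,u_\kappa u_\lambda(\sigma R_{\kappa\lambda} - E_{\kappa\lambda})(b_\lambda - b_\kappa) - 2R_{\mu\nu}(\sigma R_{\mu\nu} - E_{\mu\nu})\,u_\kappa u_\lambda R_{\kappa\lambda}(b_\lambda - b_\kappa) \]
\[ \qquad = 2R_{\mu\nu}\,u_\kappa u_\lambda(b_\lambda - b_\kappa)\left[E_{\mu\nu}R_{\kappa\lambda} - R_{\mu\nu}E_{\kappa\lambda}\right]. \tag{C-11} \]

If, for some $\beta > 0$, the points $P_\nu = (u_\nu,v_\nu)$ should all be located on the graph
of the function $v^*(u)$, then $\varepsilon_\nu$ would be zero for $\nu = 1,\ldots,k$. This would mean
$E_{\mu\nu} = 0$ for $1 \le \mu < \nu \le k$, so that the terms in (C-11) would be zero. Hence,
the derivative of $\sigma(\beta)$ would reduce from its general form (C-9) to

\[ \sigma' = D^{-1}\sum_{1\le\mu<\nu\le k}(c_\mu u_\nu - c_\nu u_\mu)(u_\mu a_\nu b_\mu - u_\nu a_\mu b_\nu). \]

In general, however, the approximation errors $\varepsilon_\nu$ will not be zero. Then

\[ \sigma' = D^{-1}\sum_{1\le\mu<\nu\le k}(c_\mu u_\nu - c_\nu u_\mu)(u_\mu a_\nu b_\mu - u_\nu a_\mu b_\nu) + D^{-2}E \tag{C-12} \]

where $E$ represents the sum of all terms (after cancellation) which are due to
the $\varepsilon_\nu$'s not being zero.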

Now establish inequalities for the factors in the sum in (C-12). First
for $c_\mu u_\nu - c_\nu u_\mu$: If, as before, $r_\mu = c_\mu u_\mu^{-1}$, $r_\nu = c_\nu u_\nu^{-1}$, then $r_\mu - r_\nu > 0$
if the points $P_\mu$, $P_\nu$, $V_0$ satisfy the concavity condition. Multiplication of
the last inequality by $u_\mu u_\nu$ leads to

\[ c_\mu u_\nu - c_\nu u_\mu \begin{cases} > 0, & u_\mu u_\nu > 0, \\ < 0, & u_\mu u_\nu < 0. \end{cases} \tag{C-13} \]

Next look at

\[ u_\mu a_\nu b_\mu - u_\nu a_\mu b_\nu = u_\mu(e^{\beta u_\nu} - 1)e^{\beta u_\mu} - u_\nu(e^{\beta u_\mu} - 1)e^{\beta u_\nu}. \]

Set $\beta u_\mu = x$, $\beta u_\nu = y$, $x < y$, $x \ne 0$, $y \ne 0$. Furthermore, set $y = \alpha x$, and
obtain (up to the positive factor $e^{(1+\alpha)x}$) the function

\[ f(x) = -x g(x), \quad g(x) = e^{-\alpha x} - 1 - \alpha(e^{-x} - 1). \]

With $x$ replaced by $-x$, the function $g(x)$ appearing here is the same as that in
(C-2). Therefore,

\[ u_\mu a_\nu b_\mu - u_\nu a_\mu b_\nu \begin{cases} < 0, & u_\mu u_\nu > 0, \\ > 0, & u_\mu u_\nu < 0. \end{cases} \]

This inequality and (C-13) show that the sum in the expression (C-12) for $\sigma'$
is certainly negative if the points $P_\nu = (u_\nu,v_\nu)$ are concavely located relative
to each other with respect to the point $V_0 = (0,v(0))$ (or at least those
for sufficiently large $\nu$). Consequently, if the error term $D^{-2}|E|$ is sufficiently
small, i.e., if the approximation of the points $P_\nu$ by the graph of
the function $v^*(u)$ is sufficiently good, the derivative $\sigma'$ of $\sigma(\beta)$ as given
in (C-6) will still be negative.


An essential practical side result can be formulated on the basis of the
last considerations. The considered class of distributions may be used for an
analytical fit of a given empirical statistical data set if the function $\sigma(\beta)$ is
monotonically decreasing.

At the end of Appendix D this version of the applicability criterion
shall be reformulated to obtain a practically more useful form.


This appendix shall be finished by an investigation of the limiting
behavior of $\sigma(\beta)$ as $\beta \uparrow +\infty$ and $\beta \downarrow 0$. The function $\sigma(\beta)$ is defined by the
expression

\[ \sigma = \frac{BA_{22} - CA_{12}}{A_{11}A_{22} - A_{12}^2}, \tag{C-14} \]

the various terms being given in (16). As $\beta \uparrow +\infty$ (since $u_1 < u_2 < \ldots
< u_{k-3} < 0 < u_{k-2} < u_{k-1} < u_k$),

\[ BA_{22} - CA_{12} \sim (B - u_kc_k)\,a_k^2 + (\text{terms of lower order}), \]
\[ A_{11}A_{22} - A_{12}^2 \sim (A_{11} - u_k^2)\,a_k^2 + (\text{terms of lower order}). \]

Therefore, provided $\sigma'$ is negative,

\[ \sigma \downarrow \sigma_\infty = \frac{B - u_kc_k}{A_{11} - u_k^2} = \frac{\sum_{\nu=1}^{k-1} u_\nu c_\nu}{\sum_{\nu=1}^{k-1} u_\nu^2} > 0, \quad \beta \uparrow +\infty. \]

As $\beta \downarrow 0$ the following is argued. Since the numerator and the denominator
in the expression (C-14) for $\sigma$ both go to zero as $\beta \downarrow 0$, the Bernoulli-de
L'Hospital rule applies. It is used three times. If, in the first step,
the expressions (C-7) and (C-8) are used, the following expressions are
derived in the third step for the numerator and denominator, respectively,


2[UyUy(Uyby-Uyby)(UybyCy-UybyCy)  +  (  1 6  tTBS  WhlCh  gO  tO  0)]  , 

2  [  3UyU-y  (u^b-^— Ujjby)  (by— by)  +  UyUy(Uyay— UySy)  (u^yby— U^ybjj)]  , 


Clearly, each term in the second sum approaches zero as $\beta \downarrow 0$. Since $b_\nu \to 1$
as $\beta \downarrow 0$, the first term in the first sum approaches the positive value
$u_\mu u_\nu(c_\mu u_\nu - c_\nu u_\mu)(u_\nu - u_\mu)$. Consequently, $\sigma \uparrow +\infty$ as $\beta \downarrow 0$.




Convergence  of  the  Iteration  Process 


Two functional relations have been established:

\[ \sigma = \bar\sigma(\beta), \quad \sigma = \sigma(\beta), \quad 0 < \beta < +\infty. \tag{D-1} \]

The first one is implicitly defined by the second moment equation $h(\beta,\sigma) = 0$;
the second one has been derived from a least squares fit of given log cdf
points. The properties of $\bar\sigma(\beta)$ and $\sigma(\beta)$ have been discussed in Appendixes A
and C, respectively. The parameter determination problem has exactly one
solution $\beta_o$, $\sigma_o = 1 - p_o$, $b_o$, if, and only if, there exists exactly one value
$\beta_o > 0$ such that $\sigma(\beta_o) = \bar\sigma(\beta_o)$, i.e., geometrically speaking, if, and only if,
the graphs of the two functions in (D-1) have exactly one point of intersection.

Suppose now that there is exactly one $\beta_o > 0$ such that $\sigma(\beta_o) = \bar\sigma(\beta_o)$.
Since $\bar\sigma(\beta)$ is not explicitly available, use instead of (D-1) the equivalent
equations

\[ h(\beta,\sigma) = 0, \quad \sigma = \sigma(\beta), \tag{D-2} \]

and solve them iteratively.

From the least squares fit by means of $v^*(u)$ with the starting value
$\beta = 1$, exactly one number is obtained, $\sigma_1 = \sigma(1) = D_1(1)/D(1)$. Then solve the
equation $h(\beta,\sigma_1) = 0$, its unique solution being $\beta_1$, and face the trichotomy

$\beta_1 = 1, \quad \beta_1 > 1, \quad \beta_1 < 1.$

(a) If $\beta_1 = 1$, the iteration process through (D-2) produces the sequences
$\{\sigma_\nu\}$ and $\{\beta_\nu\}$ with $\sigma_\nu = \sigma_1$, $\beta_\nu = 1$ ($\nu = 1, 2, \ldots$). Consequently, the
solution of the system (D-2) is $\beta_0 = 1$, $\sigma_0 = 1 - p_0 = \sigma_1$.

(b) If $\beta_1 > 1$, sequences $\{\sigma_\nu\}$ and $\{\beta_\nu\}$ are obtained for which not all
elements are equal. If the function $\sigma(\beta)$ is monotonically decreasing
(Appendix C), i.e., if the given data are not "ill-conditioned," in the second
iteration step a number $\sigma_2 = \sigma(\beta_1) < \sigma_1 = \sigma(1)$ is obtained. Since
$h(\beta_1,\sigma_1) = 0$, $h(\beta_1,\sigma_2) < 0$ (Appendix C). Therefore, the root $\beta_2$ of
$h(\beta,\sigma_2) = 0$ satisfies the inequality $\beta_2 > \beta_1$. These arguments apply in each
of the subsequent iteration steps. Since it was assumed that there exists only
one value $\beta_0$ for which $\sigma(\beta_0) = \bar{\sigma}(\beta_0)$, the sequence $\{\beta_\nu\}$ converges to $\beta_0$,
$\beta_\nu \to \beta_0 > 1$ as $\nu \to +\infty$, and the sequence $\{\sigma_\nu\}$ converges,
$\sigma_\nu \to \sigma_0 = 1 - p_0$ as $\nu \to +\infty$.

(c) If $\beta_1 < 1$, analogous arguments apply to establish convergence of the
sequences $\{\beta_\nu\}$ and $\{\sigma_\nu\}$ to unique limits $\beta_0 < 1$, $\sigma_0 = 1 - p_0$, respectively.

In practice, of course, the iteration process is stopped whenever a
desired accuracy has been reached, and the last $\beta$-value is taken as $\beta_0$.
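
The alternating scheme described above (evaluate the least squares relation at
the current shape value, then solve the moment equation for the next shape
value, and stop at a desired accuracy) can be sketched as follows. This is a
minimal illustration only, not the report's program: the names bisect_root,
alternate, sigma_fit_toy and h_toy are hypothetical, and the two toy functions
are simple stand-ins chosen so that the graphs intersect exactly once, at
beta = 2 and sigma = 0.5. The actual functions derived in Appendixes A and C
would have to be supplied in their place.

```python
from typing import Callable, List, Tuple


def bisect_root(f: Callable[[float], float], lo: float, hi: float,
                tol: float = 1e-12, max_steps: int = 200) -> float:
    """Find a root of f in [lo, hi] by bisection; f must change sign there."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def alternate(sigma_fit: Callable[[float], float],
              h: Callable[[float, float], float],
              beta_start: float = 1.0,
              bracket: Tuple[float, float] = (1e-6, 1e6),
              tol: float = 1e-10,
              max_iter: int = 200) -> Tuple[float, float, List[float]]:
    """Alternate sigma = sigma_fit(beta) and h(beta, sigma) = 0 until beta settles."""
    beta = beta_start
    history = [beta]
    for _ in range(max_iter):
        sigma = sigma_fit(beta)                       # least squares step
        beta_next = bisect_root(lambda b: h(b, sigma), *bracket)  # moment step
        history.append(beta_next)
        if abs(beta_next - beta) < tol:               # desired accuracy reached
            return beta_next, sigma, history
        beta = beta_next
    # Non-convergence is treated as a sign of ill-conditioned data.
    raise RuntimeError("iteration did not converge")


# Hypothetical stand-ins (assumptions, not the report's functions):
def sigma_fit_toy(beta: float) -> float:
    return 0.3 + 0.4 / beta          # monotonically decreasing in beta


def h_toy(beta: float, sigma: float) -> float:
    return sigma - 1.0 / beta        # implicit relation sigma = 1 / beta


beta0, sigma0, history = alternate(sigma_fit_toy, h_toy)
print(beta0, sigma0, len(history) - 1)
```

With these stand-ins the first root exceeds the starting value 1, so the run
corresponds to case (b): the shape values increase step by step toward the
intersection. Bisection was chosen here only because it needs nothing beyond a
sign change; any bracketing root finder could be substituted.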

Convergence of the iteration process represents the ultimate practical
test for the applicability of the class of distributions (*). If the
iteration process does not converge, this is, in fact, an indication that the
given empirical data set is ill-conditioned and that the class (*) should
not be applied.
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